(57) Abstract: A family of fatty acid transport proteins (FATPs) mediates transport of long chain fatty acids (LCFAs) across cell
membranes into cells. These proteins exhibit different expression patterns among the organs of mammals. Nucleic acids encoding
FATPs of this family, vectors comprising these nucleic acids, as well as the production of FATP proteins in host cells are described.
Also described are methods to test FATPs for fatty acid transport function, and methods to identify inhibitors or enhancers of transport
function. The altering of LCFA uptake by administering to the mammal an inhibitor or enhancer of FATP transport function of a
FATP in the small intestine can decrease or increase calories available as fats, and can decrease or increase circulating fatty acids.
The organ specificity of FATP distribution can be exploited in methods to direct drugs, diagnostic indicators and so forth to an organ
such as the heart.

FATTY ACID TRANSPORT PROTEINS

RELATED APPLICATION(S) 

This application is a continuation-in-part of U.S. Patent Application
Number 09/506,252 filed February 17, 2000 which is a continuation-in-part of U.S.
Patent Application Number 09/465,280 filed December 16, 1999 which is a
continuation-in-part of U.S. Patent Application Number 09/405,505 filed September
23, 1999, and is a continuation-in-part of U.S. Patent Application Number
09/232,195 filed January 14, 1999, both of which claim the benefit of U.S.
Provisional Application Number 60/110,941 filed December 4, 1998; U.S.
Provisional Application Number 60/093,491 filed July 20, 1998; and U.S.
Provisional Application Number 60/071,374 filed January 15, 1998. This
application is also a continuation-in-part of U.S. Patent Application Number
09/405,504 filed September 23, 1999, which is a continuation-in-part of U.S. Patent
Application Number 09/232,201 filed January 14, 1999, which claims the benefit of
U.S. Provisional Application Number 60/110,941 filed December 4, 1998; U.S.
Provisional Application Number 60/093,491 filed July 20, 1998; and U.S.
Provisional Application Number 60/071,374 filed January 15, 1998. This
application is also a continuation-in-part of U.S. Patent Application Number
09/232,197 filed January 14, 1999, United States Patent Application Number
09/232,200 filed January 14, 1999 and International Application Number
PCT/US99/00182 filed January 14, 1999, each of which claims the benefit of U.S.
Provisional Application Number 60/110,941 filed December 4, 1998; U.S.
Provisional Application Number 60/093,491 filed July 20, 1998; and U.S.
Provisional Application Number 60/071,374 filed January 15, 1998. The teachings
of each of these referenced applications are incorporated herein by reference in their
entirety.

GOVERNMENT SUPPORT 
The invention was supported, in whole or in part, by a grant from the
National Heart, Lung, and Blood Institute (HL41484), by National Institutes of
Health Grant DK 47618 and National Institutes of Health Grant 5 T32 CA 09541.
The United States Government has certain rights in the invention.

BACKGROUND OF THE INVENTION 

Long chain fatty acids (LCFAs) are an important source of energy for most
organisms. They also function as blood hormones, regulating key metabolic
functions such as hepatic glucose production. Although LCFAs can diffuse through
the hydrophobic core of the plasma membrane into cells, this nonspecific transport
cannot account for the high affinity and specific transport of LCFAs exhibited by
cells such as cardiac muscle, hepatocytes, enterocytes, and adipocytes. The
molecular mechanisms of LCFA transport remain largely unknown. Identifying
these mechanisms can lead to pharmaceuticals that modulate fatty acid uptake by the
intestine and by other organs, thereby alleviating certain medical conditions (e.g.,
obesity).

SUMMARY OF THE INVENTION

Described herein is a diverse family of fatty acid transport proteins (FATPs)
which are evolutionarily conserved; these FATPs are plasma membrane proteins
which mediate transport of LCFAs across the membranes and into cells. Members
of the FATP family described herein are present in a wide variety of organisms, from
mycobacteria to humans, and exhibit very different expression patterns in tissues
among the organisms. FATP family members are expressed in prokaryotic and
eukaryotic organisms and comprise characteristic amino acid domains or sequences
which are highly conserved across family members. In addition, the function of the
FATP gene family is conserved throughout evolution, as shown by the fact that the
Caenorhabditis (C.) elegans and mycobacterial FATPs described herein facilitate
LCFA uptake when they are overexpressed in COS cells or Escherichia (E.) coli,
respectively. FATPs are expressed in a wide variety of tissues, including all tissues
which are important to fatty acid metabolism (uptake and processing).

In specific embodiments, FATPs of the present invention are from such
diverse organisms as humans (Homo (H.) sapiens), mice (Mus (M.) musculus), F.
rubripes, C. elegans, Drosophila (D.) melanogaster, Saccharomyces (S.) cerevisiae,
Aspergillus nidulans, Cochliobolus heterostrophus, Magnaporthe grisea and
Mycobacterium (M.), such as M. tuberculosis. As described herein, four novel
mouse FATPs, referred to as mmFATP2, mmFATP3, mmFATP4 and mmFATP5,
and six human FATPs, referred to as hsFATP1, hsFATP2, hsFATP3, hsFATP4,
hsFATP5 and hsFATP6, have been identified. All four novel murine FATPs
(mmFATP2-5) and a previously identified murine FATP (renamed herein FATP1)
have orthologs in humans (hsFATP1-5); the sixth human FATP (hsFATP6) does not
as yet have a mouse ortholog. The expression patterns of these FATPs vary, as
described in detail below.

The present invention relates to FATP family members from prokaryotes and
eukaryotes, nucleic acids (DNA, RNA) encoding FATPs, and nucleic acids which
are useful as probes or primers (e.g., for use in hybridization methods, amplification
methods), for example, in methods of detecting FATP-encoding genes, producing
FATPs, and purifying or isolating FATP-encoding DNA or RNA. Also the subject
of this invention are antibodies (polyclonal or monoclonal) which bind an FATP or
FATPs; methods of identifying additional FATP family members (for example,
orthologs of those FATPs described herein by amino acid sequence) and variant
alleles of known FATP genes; methods of identifying compounds which bind to an
FATP, or modulate or alter (enhance or inhibit) FATP function; compounds which
modulate or alter FATP function; methods of modulating or altering (enhancing or
inhibiting) FATP function and, thus, LCFA uptake into tissues of a mammal (e.g.,
human) by administering a compound or molecule (a drug or agent) which increases
or reduces FATP activity; and methods of targeting compounds to tissues by
administering a complex of the compound to be targeted to tissues and a component
which is bound by an FATP present on cells of the tissues to which the compound is
to be targeted. For example, a complex of a drug to be delivered to the liver and a
component which is bound by an FATP present on liver cells (e.g., FATP5) can be
administered.

In one embodiment, the present invention relates to modulating or altering
(enhancing or inhibiting/reducing) LCFA uptake in the small intestine and, thus,
increasing or reducing the number of calories in the form of fats available to an
individual. In another embodiment, the present invention relates to inhibiting or
reducing LCFA uptake in the small intestine in order to reduce circulating fatty acid
levels; that is, LCFA uptake in the small intestine is reduced and, therefore,
circulating (blood) levels are not as high as they otherwise would be. FATP4 has
been shown to be expressed in epithelial cells of the small intestine and particularly
in the brush border layer of the small intestine. FATP2 has also been shown to be
expressed at low levels in epithelial cells of the small intestine, particularly in the
duodenum. In contrast, FATP1, FATP3, FATP5 and FATP6 were not detected in
any of the intestinal tissues. Thus, also described herein are FATPs which are
present in the epithelial cell layer of the small intestine where they mediate LCFA
uptake. These FATPs, particularly FATP4 and also FATP2, are targets for methods
and drugs which block their function or activity and are useful in treating obesity,
diabetes and heart disease. The ability of these FATPs to mediate fat uptake can be
modulated or altered (enhanced or inhibited), thus modulating fat uptake in the small
intestine. This can be done, for example, by administering to an individual, such as
a human or other animal, a drug which blocks interaction of LCFAs with FATP4
and/or FATP2 in the small intestine, thus inhibiting LCFA passage into the cells of
the small intestine. As a result, fat absorption is reduced and, although the
individual has consumed a certain quantity of fat, the LCFAs are not absorbed to the
same extent they would have been in the absence of the compound administered.

Thus, one embodiment of this invention is a method of reducing LCFA
uptake (absorption) in the small intestine and, as a result, reducing caloric uptake in
the form of fat. A further embodiment is a compound (drug) useful in inhibiting or
reducing fat absorption in the small intestine. In another embodiment, the invention
is a method of reducing circulating fatty acid levels by administering to an individual
a compound which blocks interactions of LCFAs with FATP4 and/or FATP2 in the
small intestine, thus inhibiting LCFA passage into cells of the small intestine. As a
result, fatty acids pass into the circulatory system at a diminished level and/or rate,
and circulating fatty acid levels are lower than they would be in the absence of the
compound administered. This method is particularly useful for therapy in
individuals who are at risk for or have hyperlipidemia. That is, it can be used to
prevent the occurrence of elevated levels of lipids in the blood or to treat an
individual in whom blood lipid levels are elevated. Also the subject of this
invention is a method of identifying compounds which alter FATP function (and
thus, in the case of FATP2 and/or FATP4, alter LCFA uptake in the small intestine).

In another embodiment, the present invention relates to a method of
modulating or altering (enhancing or inhibiting) the function of FATP6, which is
expressed at high levels in the heart. A method of inhibiting FATP6 function is
useful, for example, in individuals with heart disease, such as ischemia, since
reducing LCFA uptake into heart muscle in an individual who has ischemic heart
disease, which may be manifested by, for example, angina or heart attack, can reduce
symptoms or reduce the extent of damage caused by the ischemia. In this
embodiment, a drug which inhibits FATP6 function is administered to an individual
who has had or is having a heart attack, to reduce LCFA uptake by the individual's
heart and, as a result, reduce the damage caused by ischemia. In a further
embodiment, this invention is a method of targeting a compound, such as a
therapeutic drug or an imaging reagent, to heart tissue by administering to an
individual (e.g., a human) a complex of the compound and a component (e.g., a
LCFA or LCFA-like compound) which is bound by an FATP (e.g., FATP6) present
in cells of heart tissue.

In a further embodiment, LCFA uptake by the liver is modulated or altered
(enhanced or reduced) in an individual. For example, a drug which inhibits the
function of an FATP present in liver (e.g., FATP5) is administered to an individual
who is diabetic, in order to reduce LCFA uptake by liver cells and, thus, reduce
insulin resistance.

The present invention, thus, provides methods which are useful to alter,
particularly reduce, LCFA uptake in individuals and, as a result, to alter (particularly
reduce) availability of the LCFAs for further metabolism. In a specific
embodiment, the present invention provides methods useful to reduce LCFA uptake
and, thus, fatty acid metabolism in individuals, with the result that caloric
availability from fats is reduced, and circulating fatty acid levels are lower than they
otherwise would be. These methods are useful, for example, as a means of weight
control in individuals (e.g., humans) and as a means of preventing elevated serum
lipid levels or reducing serum lipid levels in humans. FATPs expressed in the small
intestine, such as FATP4, are useful targets to be blocked in treating obesity (e.g.,
chronic obesity) or to be enhanced in treating conditions in which enhanced LCFA
uptake is desired (e.g., malabsorption syndrome or other wasting conditions).

The identification of this evolutionarily conserved fatty acid transporter
family will allow a better understanding of the mechanisms whereby LCFAs traverse
the lipid bilayer as well as yield insight into the control of energy homeostasis and
its dysregulation in diseases such as diabetes and obesity.

BRIEF DESCRIPTION OF THE DRAWINGS 

The file of this patent contains at least one color photograph. Copies of this
patent with color photographs will be provided by the Patent and Trademark Office
upon request and payment of necessary fee.

Figure 1 shows the amino acid sequence alignment of FATPs: mmFATP1
(SEQ ID NO:92), mmFATP2 (SEQ ID NO:93), mmFATP3 (SEQ ID NO:94),
mmFATP4 (SEQ ID NO:95), mmFATP5 (SEQ ID NO:96), ceFATPa (SEQ ID
NO:97), scFATP (SEQ ID NO:98) and mtFATP (SEQ ID NO:99). The underlining
(amino acid residues 204-212 of mtFATP) indicates an AMP binding motif which is
found in many classes of proteins; the underlining at amino acid residues 204-507 of
the mtFATP sequence indicates the FATP 360 amino acid signature sequence.

Figures 2A-2D show results of LCFA uptake assays. COS
cells were cotransfected using the DEAE-dextran method with the mammalian
expression vector pCDNA-CD2 either alone (control; Figure 2A) or in combination
with one of the FATP-containing expression vectors (pCDNA-mmFATP1, Figure
2B; pCDNA-mmFATP2, Figure 2C; or pCMV-SPORT2-mmFATP5, Figure 2D) as
described in Materials and Methods for Example 2. COS cells were gated on
forward scatter (FSC) and side scatter (SS), and the results shown represent >10,000
cells. Cells exhibiting >300 CD2 fluorescence units (vertical line), representing 15%
of all cells, were deemed CD2 positive.

Figure 3 is a graph of fluorescence of cells expressing a FATP gene. As in
Figures 2A-2D, COS cells were cotransfected with pCDNA-CD2 either alone
(control) or in combination with one of the FATP-containing expression vectors
(pCDNA-mmFATP1, pCDNA-mmFATP2, pCMV-SPORT2-mmFATP5, or
pCDNA-ceFATPb). The mean BODIPY-FA fluorescence of the CD2-positive cells
is plotted; results shown represent the average of three experiments, each consisting
of greater than 50,000 COS cells. Note that a logarithmic scale is used on the
ordinate.
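
By way of illustration only, the gating and averaging described for Figures 2A-2D and 3 can be sketched as follows. The sketch assumes that each gated cell is available as a pair of CD2 and BODIPY-FA fluorescence values; the function name, the data layout and the example values are hypothetical and are not taken from the experiments.

```python
def mean_fa_of_cd2_positive(events, cd2_threshold=300.0):
    """Return (mean BODIPY-FA fluorescence of CD2-positive cells,
    fraction of cells that are CD2 positive).

    events: list of (cd2, bodipy_fa) fluorescence pairs, one per gated cell.
    A cell is deemed CD2 positive when its CD2 value exceeds the threshold.
    """
    if not events:
        raise ValueError("no events supplied")
    positive = [fa for cd2, fa in events if cd2 > cd2_threshold]
    if not positive:
        raise ValueError("no CD2-positive cells above the threshold")
    return sum(positive) / len(positive), len(positive) / len(events)


# Hypothetical values in arbitrary fluorescence units.
example_events = [(100.0, 5.0), (350.0, 40.0), (500.0, 60.0), (299.0, 7.0)]
mean_fa, fraction_positive = mean_fa_of_cd2_positive(example_events)
```

Restricting the average to CD2-positive cells limits the comparison to cells that took up the cotransfected plasmids, which is the stated reason for gating on the CD2 marker.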

Figure 4 is a graph of the uptake of palmitate with time. The full-length
coding region of mtFATP (squares) or a control protein (TFE3; circles) was
subcloned into the inducible, prokaryotic expression vector pET (Novagen,
Madison, WI). Expression from the resulting plasmid was induced (solid symbols)
in transformed E. coli cells with 1 mM isopropyl-β-D-thiogalactoside (IPTG) for 1
hour, or cells were left uninduced (open symbols). Data points were done in
triplicate and counts were normalized to the number of bacteria as determined by
OD600.

Figure 5 is a phylogenetic tree produced by aligning complete and partial
sequences for FATP genes from human, rat, mouse, puffer fish, D. melanogaster, C.
elegans, S. cerevisiae, and M. tuberculosis using ClustalX and using these data to
produce a phylogenetic tree using TreeViewPPC. The bar indicates the number of
substitutions per residue, i.e., 0.1 corresponds to a distance of 10 substitutions per
100 residues.

Figure 6 shows a comparison of the FATP signature sequences of mmFATP1
(SEQ ID NO:1), mmFATP5 (SEQ ID NO:2), ceFATPa (SEQ ID NO:3), scFATP
(SEQ ID NO:4) and mtFATP (SEQ ID NO:5).

Figure 7 shows the sequence identity among the FATP family members and
VLACs, based on the 360 amino acid signature sequence of FATP from Figure 1.

Figures 8A and 8B are the mmFATP3 DNA sequence (SEQ ID NO:6).

Figure 9 is the mmFATP3 protein sequence (SEQ ID NO:7).

Figures 10A and 10B are the mmFATP4 DNA sequence (SEQ ID NO:8).

Figure 11 is the mmFATP4 protein sequence (SEQ ID NO:9).

Figures 12A and 12B are the mmFATP5 DNA sequence (SEQ ID NO:10).

Figure 13 is the mmFATP5 protein sequence (SEQ ID NO:11).

Figures 14A and 14B are the hsFATP2 DNA sequence (SEQ ID NO:12).

Figure 15 is the hsFATP2 protein sequence (SEQ ID NO:13).

Figures 16A and 16B are the hsFATP3 DNA sequence (SEQ ID NO:14).

Figure 17 is the hsFATP3 protein sequence (SEQ ID NO:15).

Figures 18A and 18B are the hsFATP4 DNA sequence (SEQ ID NO:16).

Figure 19 is the hsFATP4 protein sequence (SEQ ID NO:17).

Figures 20A and 20B are the hsFATP5 DNA sequence (SEQ ID NO:18).

Figure 21 is the hsFATP5 protein sequence (SEQ ID NO:19).

Figures 22A and 22B are the hsFATP6 DNA sequence (SEQ ID NO:20).

Figure 23 is the hsFATP6 protein sequence (SEQ ID NO:21).

Figures 24A and 24B are the mtFATP DNA sequence (SEQ ID NO:22).

Figure 25 is the mtFATP protein sequence (SEQ ID NO:23).

Figure 26 shows the DNA sequence (SEQ ID NO:24) and predicted amino
acid sequence (SEQ ID NO:25) of human FATP1.

Figure 27 shows the DNA sequence (SEQ ID NO:26) and predicted amino
acid sequence (SEQ ID NO:27) of human FATP4.

Figure 28A is a hydrophobicity plot for hsFATP1, showing that it has
multiple membrane-spanning domains.

Figure 28B is the amino acid composition of hsFATP1.

Figure 28C is a hydrophilicity plot for hsFATP1, made using the Kyte-Doolittle
method, averaging hydrophilicity values for 18 amino acid residues at a
time.
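
By way of illustration only, the windowed averaging described for the Kyte-Doolittle plots can be sketched as follows. The sketch assumes the published Kyte-Doolittle hydropathy values (positive values are hydrophobic) and a window of 18 residues; the function name kd_profile and the toy sequence are hypothetical and are not taken from the figures. Plotting software may invert the sign to display hydrophilicity rather than hydropathy, which is likewise an assumption here.

```python
# Kyte-Doolittle hydropathy values; positive values are hydrophobic.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def kd_profile(seq, window=18):
    """Return the mean Kyte-Doolittle value of every full window along seq."""
    scores = [KD[aa] for aa in seq.upper()]
    if len(scores) < window:
        return []
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


# Hypothetical toy sequence: a hydrophobic stretch followed by a charged one.
toy_sequence = "L" * 18 + "K" * 18
toy_profile = kd_profile(toy_sequence)
```

A window of roughly this length is commonly used when looking for membrane-spanning segments, since a sustained run of high hydropathy values suggests a stretch long enough to cross a lipid bilayer.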

Figure 29A is a hydrophobicity plot for hsFATP4, showing that it has
multiple membrane-spanning domains.

Figure 29B is a listing of the amino acid composition of hsFATP4.

Figure 29C is a hydrophilicity plot for hsFATP4, made using the Kyte-Doolittle
method, averaging hydrophilicity values for 18 amino acid residues at a
time.

Figures 30A and 30B show a comparison of the nucleotide sequence of
human FATP1 (SEQ ID NO:28) and the nucleotide sequence of mouse FATP1
(SEQ ID NO:29).

Figures 31A and 31B show a comparison of the nucleotide sequence of
human FATP4 (SEQ ID NO:30) and the nucleotide sequence of mouse FATP4
(SEQ ID NO:31).

Figure 32 shows a comparison of the amino acid sequence of human FATP1
(SEQ ID NO:32) and the amino acid sequence of mouse FATP1 (SEQ ID NO:33).
Shaded amino acid residues match the consensus sequence exactly.

Figure 33 shows a comparison at the amino acid level of human FATP4
(SEQ ID NO:34) and mouse FATP4 (SEQ ID NO:35). Shaded amino acid residues
match the consensus sequence exactly.

Figure 34 shows the nucleotide sequence (SEQ ID NO:36) and predicted
amino acid sequence (SEQ ID NO:37) of hsFATP6.

Figure 35A is a hydrophobicity plot for hsFATP6, showing that it has
multiple membrane-spanning domains.

Figure 35B is a listing of the amino acid composition of hsFATP6.

Figure 35C is a hydrophilicity plot for hsFATP6, made using the Kyte-Doolittle
method, averaging hydrophilicity values for 18 amino acid residues at a
time.
Figure 36 shows an alignment of the amino acid sequences of hsFATP1
(SEQ ID NO:38), hsFATP4 (SEQ ID NO:39) and hsFATP6 (SEQ ID NO:40).
Shaded amino acid residues match the consensus sequence exactly.

Figure 37 shows results of assessment of fatty acid uptake by human FATP1
and human FATP4. The percent of CD2-positive cells exhibiting a BODIPY-fluorescence
of more than 300 arbitrary units is plotted for the three different
conditions tested.

Figure 38 is a graph showing uptake of tritiated oleate, with time, by 293
cells transfected with either (diamonds) a plasmid for expression of human FATP4
or (squares) a control plasmid.

Figure 39 is an illustration of the amino acid sequences of human FATP4
(SEQ ID NO:41) and mouse FATP4 (SEQ ID NO:42) compared to human FATP1
(SEQ ID NO:43). Shown by underlining are the FATP consensus sequence (236-556
of hsFATP1) and the AMP-binding motif (246-254 of hsFATP1). The human
FATPs were cloned by screening libraries with sequences from ESTs (expressed
sequence tags). Mouse FATP4 was cloned by PCR using degenerate primers.

Figure 40 is a graph showing the uptake, with time, of tritiated oleate by
mouse enterocytes in the presence of no oligonucleotide (squares), sense
oligonucleotide (circles) or antisense oligonucleotide (diamonds).

Figure 41 is a bar graph showing uptake of tritiated oleate by mouse
enterocytes in the presence of various concentrations of antisense (solid bars),
mismatch (stippled bars) or sense (lined bars) oligonucleotides.

Figure 42 is a bar graph showing uptake of tritiated oleate and uptake of 35S-labeled
methionine by mouse enterocytes to which were added no oligonucleotide,
the antisense oligonucleotide, or the mismatch oligonucleotide.

Figure 43A is the nucleotide sequence of the gene encoding mouse FATP4
(SEQ ID NO:44).

Figure 43B is the amino acid sequence of mouse FATP4 protein (SEQ ID
NO:45).

Figures 44A, 44B, and 44C are the hsFATP1 DNA sequence (SEQ ID
NO:46). Coding region: 175-2115 (1941 nt).
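
By way of illustration only, the nucleotide counts quoted alongside the coding regions can be checked with a short calculation. The sketch assumes that the coordinates are 1-based and inclusive, which is consistent with the lengths stated in the figure descriptions; the function names are hypothetical, and whether a stated range includes the stop codon is not specified here.

```python
def coding_region_length(start, end):
    """Length in nucleotides of a 1-based, inclusive coordinate range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def codon_count(start, end):
    """Number of complete codons in the range; the length must be a multiple of 3."""
    length = coding_region_length(start, end)
    if length % 3 != 0:
        raise ValueError("coding region length is not a multiple of 3")
    return length // 3


# Coordinates quoted for the hsFATP1 DNA sequence (Figures 44A-44C).
hsfatp1_nt = coding_region_length(175, 2115)   # 1941
hsfatp1_codons = codon_count(175, 2115)        # 647
```

The same check applied to the other ranges that carry a stated length (for example 223-2085 and 208-2139) reproduces the quoted values of 1863 nt and 1932 nt.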
Figure 45 is the hsFATP1 protein sequence (SEQ ID NO:47).

Figures 46A and 46B are the hsFATP2 DNA sequence (SEQ ID NO:48).
Coding region: 223-2085 (1863 nt).

Figure 47 is the hsFATP2 protein sequence (SEQ ID NO:49).

Figure 48 is the partial DNA sequence of hsFATP3 (SEQ ID NO:50).
Coding region: 1-993.

Figure 49 is the partial protein sequence of hsFATP3 (SEQ ID NO:51).

Figures 50A, 50B, and 50C are the hsFATP4 DNA sequence (SEQ ID
NO:52). Coding region: 208-2139 (1932 nt).

Figure 51 is the hsFATP4 protein sequence (SEQ ID NO:53).

Figure 52 is the hsFATP5 partial DNA sequence (SEQ ID NO:54). Coding
region: 1-1062.

Figure 53 is the hsFATP5 partial protein sequence (SEQ ID NO:55).

Figures 54A, 54B, and 54C are the hsFATP6 DNA sequence (SEQ ID
NO:56). Coding region: 643-2502 (1860 nt).

Figure 55 is the hsFATP6 protein sequence (SEQ ID NO:57).

Figures 56A, 56B, and 56C are the rnFATP1 DNA sequence (rn=Rattus
norvegicus; SEQ ID NO:58). Coding region: 75-2015 (1941 nt).

Figure 57 is the rnFATP1 protein sequence (SEQ ID NO:59).

Figures 58A, 58B, and 58C are the rnFATP2 DNA sequence (SEQ ID
NO:60). Coding region: 795-2657 (1863 nt).

Figure 59 is the rnFATP2 protein sequence (SEQ ID NO:61).

Figures 60A and 60B are the rnFATP4 partial DNA sequence (SEQ ID
NO:62). Coding region: 1-1218.
Figure 61 is the rnFATP4 partial protein sequence (SEQ ID NO:63).

Figures 62A, 62B, and 62C are the mmFATP1 DNA sequence (SEQ ID
NO:64). Coding region: 1-1944.

Figure 63 is the mmFATP1 protein sequence (SEQ ID NO:65).

Figures 64A and 64B are the mmFATP2 DNA sequence (SEQ ID NO:66).
Coding region: 121-1992 (1872 nt).

Figure 65 is the mmFATP2 protein sequence (SEQ ID NO:67).
Figures 66A and 66B are the mmFATP3 partial DNA sequence (SEQ ID
NO:68). Coding region: 1-1830.

Figure 67 is the mmFATP3 partial protein sequence (SEQ ID NO:69).

Figures 68A, 68B, and 68C are the mmFATP4 DNA sequence (SEQ ID
NO:70). Coding region: 1-1932.

Figure 69 is the mmFATP4 protein sequence (SEQ ID NO:71).

Figures 70A and 70B are the mmFATP5 DNA sequence (SEQ ID NO:72).
Coding region: 60-2129.

Figure 71 is the mmFATP5 protein sequence (SEQ ID NO:73).

Figures 72A and 72B are the dmFATP partial DNA sequence
(dm=Drosophila melanogaster; SEQ ID NO:74). Coding region: 1-1773.

Figure 73 is the dmFATP partial protein sequence (SEQ ID NO:75).

Figure 74 is the drFATP partial DNA sequence (dr=Danio rerio, zebrafish;
SEQ ID NO:76). Coding region: 1-173.

Figure 75 is the drFATP partial protein sequence (SEQ ID NO:77).

Figures 76A and 76B are the ceFATPa DNA sequence (SEQ ID NO:78).
Coding region: 1-1953.

Figure 77 is the ceFATPa protein sequence (SEQ ID NO:79).

Figures 78A and 78B are the ceFATPb DNA sequence (SEQ ID NO:80).
Coding region: 1-1968.

Figure 79 is the ceFATPb protein sequence (SEQ ID NO:81).

Figures 80A and 80B are the chFATP DNA sequence (SEQ ID NO:82;
ch=Cochliobolus heterostrophus). Coding region: 1-1932.

Figure 81 is the chFATP protein sequence (SEQ ID NO:83).
Figure 82 is the anFATP partial DNA sequence (an=Aspergillus nidulans;
SEQ ID NO:84). Coding region: 1-597.

Figure 83 is the anFATP partial protein sequence (SEQ ID NO:85).

Figure 84 is the mgFATP partial DNA sequence (mg=Magnaporthe grisea,
rice blast; SEQ ID NO:86). Coding region: 1-522.

Figure 85 is the mgFATP partial protein sequence (SEQ ID NO:87).
Figures 86A and 86B are the scFATP DNA sequence (SEQ ID NO:88).
Coding region: 1-1872.

Figure 87 is the scFATP protein sequence (SEQ ID NO:89).

Figures 88A and 88B are the mtFATP DNA sequence (SEQ ID NO:90).

Figure 89 is the mtFATP protein sequence (SEQ ID NO:91). Coding region:
1-1794.

Figure 90 is a consensus sequence of the FATP signature sequence (SEQ ID
NO:100), based on 23 independent sequences aligned in ClustalX. The height of the
bar at each amino acid residue position indicates the degree of conservation at that
position. Gaps have been inserted to maintain the strength of the alignment.

Figure 91 is a hydrophilicity plot for hsFATP2, made using the Kyte-Doolittle
method, averaging hydrophilicity values for 18 amino acid residues at a time.

Figure 92 is a hydrophilicity plot for the hsFATP3 partial protein, made using
the Kyte-Doolittle method, averaging hydrophilicity values for 18 amino acid
residues at a time.

Figure 93 is a hydrophilicity plot for the hsFATP5 partial protein, made using
the Kyte-Doolittle method, averaging hydrophilicity values for 18 amino acid
residues at a time.

Figures 94A and 94B are a representation of the DNA sequence (SEQ ID
NO:101) of the hsFATP3 gene, and the amino acid sequence (SEQ ID NO:102) of the
hsFATP3 protein.

Figure 95 shows that mammalian expression constructs containing either
hsFATP4 (squares and triangles) or empty control vector (circles) were stably
transfected into 293 cells. Short-term uptake of Bodipy-FA in the presence of BSA
was determined by FACS. The mean fluorescence of the viable cell population is
expressed in arbitrary fluorescence units. FATP4 protein expression was determined
by densitometry of anti-FATP4 Western blots, and is expressed in arbitrary units.

Figure 96 is a bar graph illustrating short-term uptake of Bodipy-palmitate (1
μM) by either control cells (black bars) or FATP4-expressing cells (hatched bars),
measured in the presence of 0, 10, or 100 μM unlabeled palmitate. FA uptake was
quantified by FACS and expressed in arbitrary fluorescence units.
Figure 97 shows the rate of [3H]palmitate uptake by 293 cells, which were
stably transfected with a construct for either human FATP4 (diamonds) or an empty
vector (circles), compared to that of isolated enterocytes (squares).

Figure 98 is a bar graph illustrating the results when isolated enterocytes were
incubated for 48 h with increasing concentrations of the FATP4 antisense
oligonucleotide or with 100 μM of a randomized control oligonucleotide with
identical nucleotide composition to the FATP4 antisense oligonucleotide. The uptake
of oleate by the enterocytes was then measured over a 5 min time interval (solid bars).
In parallel, the levels of FATP4 protein and, as a loading control, β-catenin, were
determined by Western blotting and quantitated using densitometry (hatched bars).
FA uptake and FATP4 protein levels were normalized to that of untreated cells. The
averages and standard deviations of 4 independent experiments are shown.

Figure 99 is a bar graph illustrating the uptake rates of [3H]oleate,
[3H]palmitate and [35S]methionine by primary enterocytes, measured after 48 h
incubation with either 100 μM FATP4 antisense (solid bars) or 100 μM randomized
control oligonucleotide (hatched bars) and expressed as % of untreated cells.

Figure 100 is a bar graph illustrating that 8 kb of FATP5 genomic sequence
(SEQ ID NO.: 106) is sufficient for liver specific transcription in vitro. A luciferase
reporter construct containing 8 kb upstream of the FATP5 initiator methionine was
transfected into various cell lines using calcium phosphate as described in Example
17. Forty-eight hours after transfection, luciferase activity was measured and
normalized to β-galactosidase activity. For each cell line, fold induction was
determined by dividing the relative luciferase activity of the 8 kb construct by that of
the promoter-less luciferase reporter vector. The data shown represent the mean of
three experiments done in triplicate. Error bars indicate the SEM.
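
By way of illustration only, the normalization and fold-induction calculation described for the reporter assays can be sketched as follows. The sketch assumes that each well yields one luciferase reading and one β-galactosidase reading; the function names and the example readings are hypothetical and do not reproduce any data from the figures.

```python
from math import sqrt
from statistics import mean, stdev


def relative_activity(luciferase, beta_gal):
    """Luciferase reading normalized to the beta-galactosidase reading of the same well."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase reading must be positive")
    return luciferase / beta_gal


def fold_induction(construct_wells, promoterless_wells):
    """Mean normalized activity of the construct divided by that of the
    promoter-less reporter vector. Each argument is a list of
    (luciferase, beta_gal) readings."""
    construct = mean(relative_activity(l, b) for l, b in construct_wells)
    baseline = mean(relative_activity(l, b) for l, b in promoterless_wells)
    return construct / baseline


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


# Hypothetical triplicate readings, in arbitrary units, for one cell line.
construct_wells = [(200.0, 10.0), (220.0, 11.0), (180.0, 9.0)]
promoterless_wells = [(20.0, 10.0), (22.0, 11.0), (18.0, 9.0)]
fold = fold_induction(construct_wells, promoterless_wells)
```

Dividing by the β-galactosidase reading corrects for differences in transfection efficiency between wells, and dividing by the promoter-less vector expresses activity relative to background within each cell line.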

Figure 101 is a bar graph illustrating deletion analysis of the FATP5 promoter.
Constructs containing deletions of the FATP5 promoter were transfected into HepG2
cells, assayed for luciferase activity, and normalized to β-galactosidase (RLU). The
labels on the vertical axis correspond to the length of the promoter segment as
measured from the initiator methionine. The data shown represent the mean of three
experiments done in triplicate. Error bars indicate the SEM.
Figure 102 is a bar graph illustrating that 271 base pairs upstream of the
FATP5 initiator methionine are sufficient for liver specific luciferase activity. A
luciferase reporter construct containing 271 base pairs upstream of the FATP5
initiator methionine was transfected into various cell lines using calcium phosphate as
described in Methods Example 17. Forty-eight hours after transfection, luciferase
activity was measured and normalized to β-galactosidase activity. For each cell line,
fold induction was determined by dividing the relative luciferase activity of the -271
base pair construct by that of the promoter-less luciferase reporter vector. The data
shown represent the mean of three experiments done in triplicate. Error bars indicate
the SEM.

Figures 103A and 103B illustrate mutations of the GC box which abolish
transcriptional activity. A: Schematic of mutations in the GC box aligned with the
normal sequence (SEQ ID NO.: 106, SEQ ID NO.: 107, SEQ ID NO.: 108). The GC
box consensus sequence is underlined. B: Constructs containing 271 base pairs
upstream of the FATP5 initiator methionine with the mutations in the GC box
depicted in part A were transfected into HepG2 cells, assayed for luciferase activity,
and normalized to β-galactosidase (RLU). The data shown represent the mean of
three experiments done in triplicate. Error bars indicate the SEM.

Figure 104 shows a gel shift analysis of the GC box with HepG2 nuclear
extracts. The schematic shows the sequence of the oligonucleotides used in gel shift
studies. The numbering reflects the distance from the initiator methionine. The two
pairs of oligonucleotides are indicated by the lines and labeled AF-1 (SEQ ID NO.:
111, SEQ ID NO.: 112) and AF-2 (SEQ ID NO.: 109, SEQ ID NO.: 110).

Figure 105 is a bar graph illustrating that 30 bp internal deletions of the
FATP5 promoter identify another region required for luciferase activity in HepG2
cells. Reporter constructs were transfected into HepG2 cells. Luciferase activity was
measured and normalized to β-galactosidase activity (RLU). The labels on the
horizontal axis correspond to the nucleotides that were deleted and the numbering on
the vertical axis represents the distance from the initiator methionine. The data
shown represent the mean of three experiments done in triplicate. Error bars indicate
the SEM. Note that the five-fold higher RLU activity in this figure relative to Figures
101 and 103 is the result of a manufacturer change in the β-galactosidase reagent.

Figure 106 is a bar graph illustrating that a linker scan of the FATP5 promoter
identifies two additional elements required for transcription in HepG2 cells. Reporter
constructs were transfected into HepG2 cells. Luciferase activity was measured and
normalized to β-galactosidase activity (RLU). The labels on the horizontal axis
correspond to the constructs in part A. The data shown represent the mean of three
experiments done in triplicate. Error bars indicate the SEM. Please note that the
lower RLU activity in this figure relative to Figures 101 and 103 is also the result of a
manufacturer change in the β-galactosidase reagent.

Figure 107 is a schematic of the FATP5 promoter (SEQ ID NO.: 113). The
GC box and two motifs identified in the linker scan are boxed and labeled. An arrow
indicates the translational initiator of the FATP5 protein. The two halves of the
palindrome contained in the novel motifs and referred to in the discussion are
underlined.

Figure 108 is a photograph showing FATP2 expression in the mouse gall
bladder epithelium.

Figure 109 is a photograph showing FATP2 expression in chimpanzee liver.

Figure 110 is a photograph showing FATP5 expression in chimpanzee liver.

Figures 111A and 111B represent the DNA sequence (SEQ ID NO:116) of
human FATP3.

Figure 112 represents the amino acid sequence (SEQ ID NO:117) of human
FATP3.

Figure 113 is a bar graph showing the results of an experiment comparing
fatty acid transport between cells transfected with SEQ ID NO:116 and untransfected
cells.

Figures 114A, 114B, 114C and 114D represent portions of the amino acid
sequence of mmFATP4 which were produced as fusion polypeptides in E. coli cells.

Figure 115 is a schematic illustrating certain components of the fusion
polypeptides depicted in Figures 114A-D. The schematic shows the lipocalin domain
as well as other identified motifs and notes the relative location of each in the
mmFATP4 fusion polypeptide.

Figure 116 is a bar graph illustrating the results of an experiment comparing
the binding capabilities of the fusion polypeptides shown in Figures 114A-D for an
oleate fatty acid.

Figure 117 is a bar graph showing the results of an experiment comparing
binding of various fatty acids between two of the fusion polypeptides depicted in
Figures 114A-D.

Figures 118A-G illustrate the consensus sequence of hsFATP1, hsFATP2,
hsFATP3, hsFATP4, hsFATP5 and hsFATP6 with the lipocalin domain and
AMP-binding domain of each noted.



The foregoing and other objects, features and advantages of the invention will
be apparent from the following more particular description of preferred embodiments
of the invention, as illustrated in the accompanying drawings in which like reference
characters refer to the same parts throughout the different views. The drawings are
not necessarily to scale, emphasis instead being placed upon illustrating the principles
of the invention.



DETAILED DESCRIPTION OF THE INVENTION 

As described herein, FATPs are a large, evolutionarily conserved family of
proteins that mediate the transport of LCFAs into cells. The family includes proteins
which are conserved from mycobacteria to humans and exhibit very different
expression patterns in tissues. Specific embodiments described include FATPs from
mice, humans, nematodes, fungi and mycobacteria which have been shown to be
functional LCFA transporters. The term "fatty acid transport proteins" ("FATPs") as
used herein, refers to the proteins described herein as FATP1, FATP2, FATP3,
FATP4, FATP5 and FATP6, which have been described in one or more species of
mammals, as well as mtFATP, ceFATP, scFATP, anFATP, mgFATP, and chFATP,
and other proteins sharing at least about 50% amino acid sequence similarity,



WO 01/21795 



PCT/US00/25891 



-18- 

preferably at least about 60% sequence similarity, more preferably at least about 70%
sequence similarity, and still more preferably, at least about 80% sequence similarity,
and most preferably, at least about 90% sequence similarity in the approximately 360
amino acid signature sequence. The approximately 360 amino acid FATP signature
sequence is shown in Figure 1. The consensus sequence of the signature sequence is
shown in Figure 90. The nomenclature used herein to refer to FATPs includes a
species-specific prefix (e.g., mm, Mus musculus; hs or h, Homo sapiens or human; mt,
M. tuberculosis; dm, D. melanogaster; ce, C. elegans; sc, Saccharomyces cerevisiae)
and a number such that mammalian homologues in different species share the same
number. For example, six human and five mouse FATP genes which are expressed in
a variety of tissues are described herein and are referred to, respectively, as
hsFATP1-hsFATP6 and mmFATP1-mmFATP5; for example, hsFATP4 and mmFATP4 are the
human and mouse orthologs.

Expression patterns of human and mouse FATPs have been assessed and are
described below. Briefly, results of these assessments show that FATP5 is a
liver-specific gene. FATP2 is highly expressed in liver, kidney and gall bladder
epithelium. Both of these proteins, as well as FATP4 and FATPs from nematodes
and mycobacteria, have been shown to be functional LCFA transporters. Results
have also shown that FATP4 mRNA is present at high levels in epithelial cells of two
regions of the small intestine (the jejunum and ileum) and at lower, but significant,
levels in a third region (the duodenum). The results further showed that FATP2 mRNA
is present in epithelial cells of the duodenum at a level similar to that of FATP4 mRNA,
but is present at lower levels in the jejunum and ileum. FATP4 mRNA was
absent from other cell types of the small intestine and no FATP4 mRNA could be
detected in any cells of the colon. No signals above background could be detected for
FATP1, FATP3 and FATP5 in any of the intestinal tissues. Thus, FATP4 is the
major FATP in the mouse small intestine, which supports a major role for FATP4
(along with FATP2 to a lesser extent) in absorption of free fatty acids. hsFATP4 was
clearly expressed in the jejunum and ileum; expression was absent in the stomach.
This, too, is consistent with a major role for FATP4 in absorption of fatty acids in the
human gut. Analysis of FATP expression in human tissues, also described in detail
below, showed that hsFATP6, which has no mouse ortholog as yet, is expressed at
high levels in the heart and at low levels in the placenta, but is undetectable in the
other tissues assessed (Example 9). This is consistent with a major role for FATP6 in
absorption of fatty acids in the heart.

Analysis of FATP3 expression in murine tissues, also described in detail
below, showed that expression occurs at detectable levels in liver, spleen, heart,
kidney, testis, white adipose tissue, exocrine and endocrine pancreatic cells, and also
in lung tissues. FATP3 is expressed at high levels in type-II pneumocytes, a cell type
noted for secretion of surfactant, a phospholipid-rich film critical for lung function
(Example 19).

Long chain fatty acids (LCFAs) are an important energy source for pro- and
eukaryotes and are involved in diverse cellular processes, such as membrane
synthesis, intracellular signaling, protein modification, and transcriptional regulation.
In developed Western countries, human dietary lipids are mainly di- and triglycerides
and account for approximately 40% of caloric intake (Weisburger, J. H. (1997) J. Am.
Diet. Assoc. 97:S16-S23). These lipids are broken down into fatty acids and glycerol
by pancreatic lipases in the small intestine (Chapus, C., Rovery, M., Sarda, L. &
Verger, R. (1988) Biochimie 70:1223-34); LCFAs are then transported into brush
border cells, where the majority is re-esterified and secreted into the lymphatic system
as chylomicrons (Green, P.H. & Riley, J.W. (1981) Aust. N.Z. J. Med. 11:84-90).
Fatty acids are liberated from lipoproteins by the enzyme lipoprotein lipase, which is
bound to the luminal side of endothelial cells (Scow, R.O. & Blanchette-Mackie, E.J.
(1992) Mol. Cell. Biochem. 116:181-191). "Free" fatty acids in the circulation are
bound to serum albumin (Spector, A. A. (1984) Clin. Physiol. Biochem. 2:123-134)
and are rapidly incorporated by adipocytes, hepatocytes, and cardiac muscle cells.
The latter derive 60-90% of their energy through the oxidation of LCFAs (Neely, J.F.,
Rovetto, M.J. & Oram, J.F. (1972) Prog. Cardiovasc. Dis. 15:289-329). Although
saturable and specific uptake of LCFAs has been demonstrated for intestinal cells,
hepatocytes, cardiac myocytes, and adipocytes, the molecular mechanisms of LCFA
transport across the plasma membrane have remained controversial (Hui, T.Y. &
Bernlohr, D.A. (1997) Front. Biosci. 75:d222-31-d231; Schaffer, J.E. & Lodish, H.F.
(1995) Trends Cardiovasc. Med. 5:218-224). Described herein is a large family of
highly homologous mammalian LCFA transporters which show wide expression,
including in all tissues relevant to fatty acid metabolism. Further described are novel
members of this family in other species, including mycobacterial and nematode
FATPs which, like their mammalian counterparts, are functional fatty acid
transporters.

The discovery of a diverse but highly homologous family of FATPs is
reminiscent of the glucose transporter family. In a manner similar to the FATPs, the
glucose transporters have very divergent patterns of tissue expression (McGowan,
K.M., Long, S.D. & Pekala, P.H. (1995) Pharmacol. Ther. 66:465-505). The FATPs,
like glucose transporters, may also differ in their substrate specificities, uptake
kinetics, and hormonal regulation (Thorens, B. (1996) Am. J. Physiol. 270:G541-G553).
Indeed, the levels of fatty acids in the blood, like those of glucose, can be
regulated by insulin and are dysregulated in diseases such as noninsulin-dependent
diabetes and obesity (Boden, G. (1997) Diabetes 46:3-10). The underlying
mechanisms for the regulation of free fatty acid concentrations in the blood are not
understood, but could be explained by hormonal modulation of FATPs.

Insulin resistance is thought to be the major defect in non-insulin-dependent
diabetes mellitus (NIDDM) and is one of the earliest manifestations of NIDDM
(McGarry (1992) Science 258:766-770). Free fatty acids (FFAs) may provide an
explanation for why obesity is a risk factor for NIDDM. Plasma levels of FFAs are
elevated in diabetic patients (Reaven et al. (1988) Diabetes 37:1020). Elevated
plasma FFAs have been demonstrated to induce insulin resistance in
whole animals and humans (Boden (1998) Front. Biosci. 3:D169-D175). This
insulin resistance is likely mediated by effects of FFAs on a variety of tissues. FFAs
added to adipocytes in vitro induce insulin resistance in this cell type as evidenced by
inhibition of insulin-induced glucose transport (Van Epps-Fung et al. (1997)
Endocrinology 138:4338-4345). Rats fed a high fat diet developed skeletal muscle
insulin resistance as evidenced by a decrease in insulin-induced glucose uptake by
skeletal muscle (Han et al. (1997) Diabetes 46:1761-1767). In addition, elevated
plasma FFAs increase insulin-suppressed endogenous glucose production in the liver
(Boden (1998) Front. Biosci. 3:D169-D175), thus increasing hepatic glucose output.
It has been postulated that the adverse effects of plasma free fatty acids are due to the
FFAs being taken up into the cell, leading to an increase in intracellular long chain
fatty acyl CoA; intracellular long chain acyl CoAs are thought to mediate the effects
of FFAs inside the cell. Thus, fatty acid induced insulin resistance may be prevented
by blocking uptake of FFAs into select tissues, in particular liver (by blocking FATP2
and/or FATP5), adipocyte (by blocking FATP1), and skeletal muscle (by blocking
FATP1). Blocking intestinal fat absorption (by blocking FATP4) is also expected to
reduce plasma FFA levels and thus improve insulin resistance.

During the pathogenesis of NIDDM, insulin resistance can initially be
counteracted by increasing insulin output by the pancreatic beta cell. Ultimately, this
compensation fails, beta cell function decreases and overt diabetes results (McGarry
(1992) Science 258:766-770). Manipulating beta cell function is a second point
where fatty acid transporter blockers may be beneficial for diabetes. While no FATP
homolog has been identified so far that is expressed in the beta cell of the pancreas,
the data described below suggest the existence of such a transporter and the sequence
information included herein provides the means to identify such a transporter by
degenerate PCR, using primers to regions conserved in all FATP family members, or
by low stringency hybridization. It has been demonstrated that exposure of pancreatic
beta cells to FFAs increases the basal rate of insulin secretion; this in turn leads to a
decrease in the intracellular stores of insulin, resulting in decreased capacity for
insulin secretion after chronic exposure (Bollheimer et al. (1998) J. Clin. Invest.
101:1094-1101). The effects of FFAs are again likely to be mediated by intracellular
long chain fatty acyl CoA molecules (Liu et al. (1998) J. Clin. Invest. 101:1870-1875).
FFAs have also been demonstrated to increase beta cell apoptosis
(Shimabukuro et al. (1998) Proc. Natl. Acad. Sci. USA 95:2498-2502), possibly
contributing to the decrease in beta cell numbers in late stage NIDDM.

Another finding with potentially broad implications is the identification of a
FATP homologue in M. tuberculosis. Tuberculosis causes more deaths worldwide
than any other infectious agent and drug-resistant tuberculosis is re-emerging as a
problem in industrialized nations (Bloom, B.R. & Small, P.M. (1998) N. Engl. J.



Med. 335:677-678). Mycobacterium tuberculosis has about 250 enzymes involved in
fatty acid metabolism, compared with only about 50 in E. coli. It has been suggested
that, living as a pathogen, the mycobacteria are largely lipolytic, rather than lipogenic,
relying on the lipids within mammalian cells and the tubercle (Cole, S.T. et al.,
Nature 393:537-544 (1998)). The de novo synthesis of fatty acids in Mycobacterium
leprae is insufficient to maintain growth (Wheeler, P.R., Bulmer, K. & Ratledge, C.
(1990) J. Gen. Microbiol. 136:211-217). Thus, it is reasonable to expect that
inhibitors of mtFATP will serve as therapeutics for tuberculosis. FATPs expressed in
mycobacteria can be targeted to reduce or prevent replication of mycobacteria (e.g., to
reduce or prevent replication of M. tuberculosis) and, thus, reduce or prevent their
adverse effects. For example, a FATP or FATPs expressed by M. tuberculosis can be
targeted and inhibited, thus reducing or preventing growth of this pathogen (and
tuberculosis in humans and other mammals). An inhibitor of an M. tuberculosis
FATP can be identified, using methods described herein (e.g., expressing the FATP
in an appropriate host cell, such as E. coli or COS cells; contacting the cells with an
agent or drug to be assessed for its ability to inhibit the FATP and, as a result,
mycobacterial growth; and assessing its effects on growth). A drug or agent
identified in this manner can be further tested for its ability to inhibit an M.
tuberculosis FATP and M. tuberculosis infection in an appropriate animal model or
in humans. A method of inhibiting mycobacterial growth, particularly growth of M.
tuberculosis, and compounds useful as drugs for doing so are also the subject of this
invention.

An isolated polynucleotide encoding mtFATP, like other polynucleotides
encoding FATPs of the FATP family, can be incorporated into vectors, nucleic acids
of viruses, and other nucleic acid constructs that can be used in various types of host
cells to produce mtFATP. This mtFATP can be used, as it appears on the surface of
cells, or in various artificial membrane systems, to assess fatty acid transport
function and to identify ligands and molecules that are modulators of fatty acid transport
activity. Molecules found to be inhibitors of mtFATP function can be incorporated
into pharmaceutical compositions to administer to a human for the treatment of
tuberculosis.

Particular embodiments of the invention are polynucleotides encoding a
FATP of Cochliobolus (Helminthosporium) heterostrophus or portions or variants
thereof, the isolated or recombinantly produced FATP, methods for assessing whether
an agent binds to the chFATP, and further methods for assessing the effect of an
agent being tested for its ability to modulate fatty acid transport activity.
Cochliobolus heterostrophus is an ascomycete that is the cause of southern corn leaf
blight, an economically important threat to the corn crop in the United States. The
related species C. sativus causes crown rot and common root rot in wheat and barley.
One or more FATPs of C. heterostrophus can be targeted for the identification of an
inhibitor of chFATP function, which can then be used as an agent effective against
infection of plants by C. heterostrophus and related organisms. Methods described
herein that were applied in studying the expression of a FATP gene and the function
of the FATP in its natural site of expression or in a host cell can be used in the study
of the chFATP gene and protein.

Magnaporthe grisea (rice blast) is an economically important fungal pathogen
of rice. Further embodiments of the invention are nucleic acid molecules encoding a
FATP of Magnaporthe grisea, portions thereof, or variants thereof, isolated
mgFATP, nucleic acid constructs, and engineered cells expressing mgFATP. Other
aspects of the invention are assays to identify an agent which binds to mgFATP and
assays to identify an agent which modulates the function of mgFATP in cells in
which mgFATP is expressed or in artificial membrane systems. Agents identified as
inhibiting mgFATP activity can be developed into anti-fungal agents to be used to
treat rice infected with rice blast.

Caenorhabditis elegans is a nematode related to plant pathogens and human
parasites. An isolated polynucleotide which encodes ceFATP, like other
polynucleotides encoding FATPs of the FATP family described herein, can be
incorporated into nucleic acid vectors and other constructs that can be used in various
types of cells to produce ceFATP. ceFATP, as it occurs in cells or as it can be isolated
or incorporated into various artificial or reconstructed membrane systems, can be
used to assess fatty acid transport, and to identify ligands and agents that modulate
fatty acid transport activity. Agents found by such assays to be inhibitors of ceFATP
activity can be incorporated into compositions for the treatment of diseases caused by
genetically related organisms with a FATP of similar sensitivity to the agents.

Aspergillus nidulans is one of a family of fungal species that can infect
humans. Further embodiments of the invention of the family of polynucleotides
encoding FATPs are polynucleotides encoding a FATP of Aspergillus nidulans, and
vectors and host cells that can be constructed to comprise such polynucleotides.
Further embodiments are a polypeptide encoded by such polynucleotides, portions
thereof having one or more functions characteristic of a FATP, and various methods.
The methods include those for identifying agents that bind to a FATP and those for
assessing the effect of an agent being tested for its ability to modulate fatty acid
transport activity. Those agents found to inhibit fatty acid transport function can be
used in compositions as anti-fungal pharmaceuticals, or can be modified for greater
effectiveness as a pharmaceutical.

One aspect of the invention relates to isolated nucleic acids that encode a
FATP as described herein, such as those FATPs having an amino acid sequence in
Figure 45 (SEQ ID NO:47), Figure 47 (SEQ ID NO:49), Figure 112 (SEQ ID
NO:117), Figure 51 (SEQ ID NO:53), Figure 53 (SEQ ID NO:55), and Figure 55 (SEQ
ID NO:57) and nucleic acids closely related thereto as described herein.

Using the information provided herein, such as a nucleic acid sequence set
forth in Figures 44A-44C (SEQ ID NO:46), Figures 46A and 46B (SEQ ID NO:48),
Figures 111A and 111B (SEQ ID NO:116), Figures 50A-50C (SEQ ID NO:52), Figure 52
(SEQ ID NO:54), and Figures 54A-54C (SEQ ID NO:56), a nucleic acid of the invention
encoding a FATP polypeptide has been obtained using standard cloning and
screening methods, such as those for cloning and sequencing cDNA library
fragments, followed by obtaining a full length clone. For example, to obtain a nucleic
acid of the invention, a library of clones of cDNA of human or other mammalian
DNA can be probed with a labeled oligonucleotide, such as a radiolabeled
oligonucleotide, preferably about 17 nucleotides or longer, derived from a partial
sequence. Clones carrying DNA identical to that of the probe can then be
distinguished using stringent (also, "high stringency") hybridization conditions. By
sequencing the individual clones thus identified with sequencing primers designed
from the original sequence it is then possible to extend the sequence in both
directions to determine the full length sequence. Suitable techniques are described,
for example, in Current Protocols in Molecular Biology (F.M. Ausubel et al., eds),
containing supplements through Supplement 42, 1998, John Wiley and Sons, Inc.,
especially chapters 5, 6 and 7.

Embodiments of the invention include isolated nucleic acid molecules
comprising any of the following nucleotide sequences: 1.) a nucleotide sequence
which encodes a protein comprising the amino acid sequence of hsFATP1 (SEQ ID
NO:47), the amino acid sequence of hsFATP2 (SEQ ID NO:49), the amino acid
sequence of hsFATP3 (SEQ ID NO:117), the amino acid sequence of hsFATP4 (SEQ
ID NO:53), the amino acid sequence of hsFATP5 (SEQ ID NO:55) or the amino acid
sequence of hsFATP6 (SEQ ID NO:57); 2.) nucleotide sequences of hsFATP1,
hsFATP2, hsFATP3, hsFATP4, hsFATP5, or hsFATP6 (SEQ ID NO:46, 48, 116, 52,
54, or 56, respectively); 3.) a nucleotide sequence which is complementary to the
nucleotide sequence of hsFATP1 (SEQ ID NO:46), hsFATP2 (SEQ ID NO:48),
hsFATP3 (SEQ ID NO:116), hsFATP4 (SEQ ID NO:52), hsFATP5 (SEQ ID NO:54)
or hsFATP6 (SEQ ID NO:56); 4.) a nucleotide sequence which consists of the
coding region of hsFATP1 (SEQ ID NO:46), the coding region of hsFATP2 (SEQ ID
NO:48), the coding region of hsFATP3 (SEQ ID NO:116), the coding region of
hsFATP4 (SEQ ID NO:52), the coding region of hsFATP5 (SEQ ID NO:54), or the
coding region of hsFATP6 (SEQ ID NO:56).

The invention further relates to nucleic acids (nucleic acid molecules or
polynucleotides) having nucleotide sequences identical over their entire length to
those shown in the figures, for instance Figures 44A-44C (SEQ ID NO:46), Figures
46A and 46B (SEQ ID NO:48), Figures 111A-B (SEQ ID NO:116), Figures 50A-50C
(SEQ ID NO:52), Figure 52 (SEQ ID NO:54), and Figures 54A-54C (SEQ ID
NO:56). It further relates to DNA which, due to the degeneracy of the genetic code,
encodes a FATP encoded by one of the FATP-encoding DNAs, whose amino acid
sequence is provided herein. Also provided by the invention are nucleic acids having
the coding sequences for the mature polypeptides or fragments in reading frame with
other coding sequences, such as those encoding a leader or secretory sequence, a pre-,
or pro- or prepro- protein sequence. The nucleic acids of the invention encompass
nucleic acids that include a single continuous region or discontinuous regions
encoding the polypeptide, together with additional regions, that may also contain
coding or non-coding sequences. The nucleic acids may also contain non-coding
sequences, including, for example, but not limited to, non-coding 5' and 3' sequences,
such as the transcribed, non-translated sequences, termination signals, ribosome
binding sites, sequences that stabilize mRNA, introns, polyadenylation signals, and
additional coding sequences which encode additional amino acids. For example, a
marker sequence that facilitates purification of the fused polypeptide can be encoded.
In certain embodiments of the invention, the marker sequence can be a hexa-histidine
peptide, as provided in the pQE vector (Qiagen, Inc., Venlo, The Netherlands) and
described in Gentz et al., Proc. Natl. Acad. Sci. USA 86: 821-824 (1989), or an HA
tag (Wilson et al., Cell 37: 767 (1984)), or a sequence encoding glutathione
S-transferase of Schistosoma japonicum (vectors available from Pharmacia; see Smith,
D.B. and Johnson K.S., Gene 57:31 (1988) and Kaelin, W.G. et al., Cell 70:351
(1992)). Nucleic acids of the invention also include, but are not limited to, nucleic
acids comprising a structural gene and its naturally associated sequences that control
gene expression.
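
As an illustration of the degeneracy of the genetic code referred to above, the following minimal Python sketch translates two different DNA sequences with the standard genetic code and shows that they encode the same peptide. The nine-nucleotide sequences and the function name `translate` are hypothetical and chosen only for illustration; they are not FATP sequences from the figures.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (first base, then second, then third). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical toy sequences (not taken from the FATP figures).
seq_a = "ATGAAACTG"  # codons ATG AAA CTG
seq_b = "ATGAAGTTA"  # codons ATG AAG TTA; differs from seq_a at two codons

print(translate(seq_a))
print(translate(seq_b))
```

Both calls yield the same three-residue peptide, which is the sense in which distinct DNAs can encode the same FATP amino acid sequence.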

The invention further relates to nucleic acids (nucleic acid molecules or
polynucleotides) that encode a FATP polypeptide. In a particular embodiment, a
nucleic acid encodes a portion of a FATP which includes a motif or domain, for
example, a lipocalin domain or an AMP-binding domain. Such a polypeptide portion
can be a functional portion of a FATP protein. The term "lipocalin domain" is an
art-recognized term and as used herein refers to a particular domain present in FATP
proteins. This domain is described as including regions of sequence homology as
well as a common tertiary structure represented as an eight stranded antiparallel
beta-barrel (see Banaszak, L. et al., Advances in Protein Chemistry, 45: 89-151). Many
lipocalin domains can be identified structurally as a sequence contained within the
general formula:
[DENG]-X-[DENQGSTARK]-X(0,2)-[DENQARK]-[LIVFY]-{CP}-G-{C}-W-[FYWLRH-X]-[LIVMTA],
e.g., the lipocalin signature sequence or
consensus pattern (SEQ ID NO: 125). One skilled in the art will recognize that a
lipocalin domain for a particular FATP protein can vary in sequence from this general
formula. A FATP lipocalin domain can be, for example, identical to the lipocalin
signature sequence or can exhibit 60, 65, 70, 75, 80, 85, 90, 95 or greater percent
sequence identity compared to the general formula provided that it retains lipocalin
binding function. For example, a lipocalin domain for each of the human FATPs,
hsFATP1 (SEQ ID NO: 126), hsFATP2 (SEQ ID NO: 127), hsFATP3 (SEQ ID NO:
128), hsFATP4 (SEQ ID NO: 129), hsFATP5 (SEQ ID NO: 130), and hsFATP6
(SEQ ID NO: 131) has been identified. The pattern of these lipocalin domains is
highly conserved across the FATP family.
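
The general formula above is written in a PROSITE-style notation, in which square brackets list permitted residues, curly braces list excluded residues, X is any residue, and X(0,2) is zero to two residues of any kind. A minimal Python sketch of how such a formula might be scanned against a protein sequence follows. It rests on two assumptions that are not confirmed by the text: the element printed as "[FYWLRH-X]" is read as one residue from FYWLRH followed by any residue, and the test peptide is an invented example rather than a FATP sequence. The names `LIPOCALIN_RE` and `find_lipocalin_motifs` are hypothetical.

```python
import re

# Hand translation of the PROSITE-style formula into a regular expression.
# Assumption: "[FYWLRH-X]" is interpreted as "[FYWLRH]-X".
LIPOCALIN_RE = re.compile(
    r"[DENG]"        # [DENG]
    r"."             # X: any residue
    r"[DENQGSTARK]"  # [DENQGSTARK]
    r".{0,2}"        # X(0,2): zero to two residues
    r"[DENQARK]"     # [DENQARK]
    r"[LIVFY]"       # [LIVFY]
    r"[^CP]"         # {CP}: any residue except C or P
    r"G"             # G
    r"[^C]"          # {C}: any residue except C
    r"W"             # W
    r"[FYWLRH]"      # first half of the assumed reading of [FYWLRH-X]
    r"."             # second half: any residue
    r"[LIVMTA]"      # [LIVMTA]
)


def find_lipocalin_motifs(protein):
    """Return (start index, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in LIPOCALIN_RE.finditer(protein.upper())]


# Invented toy peptide containing one stretch that satisfies the pattern.
print(find_lipocalin_motifs("MKDADELAGAWFALQQ"))
```

A match only indicates agreement with the signature pattern; as noted above, a lipocalin domain of a particular FATP can vary from the general formula, so a strict pattern scan may miss genuine domains.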

A nucleic acid encoding a portion of a FATP polypeptide can encode one or
more domains, and also can include additional nucleotides. For example, the nucleic
acid can also include nucleotide sequences that encode a portion of a FATP protein
that is upstream from a lipocalin domain. As the term "upstream" or "upstream
sequences" is used herein in relation to the lipocalin domain, it is intended to refer to
the nucleotide sequence which encodes all or a portion of a FATP protein located
between the signal peptide (when one is present) and the lipocalin domain. In the
absence of a signal peptide, the term refers to the nucleotide sequence which encodes
all or a portion of a FATP protein between the lipocalin domain and the amino
terminus (see Figure 115).

The invention further relates to variants, including naturally-occurring allelic
variants, of those nucleic acids described specifically herein by DNA sequence, that
encode variants of such polypeptides as those having the amino acid sequences
shown in Figure 45 (SEQ ID NO:47), Figure 47 (SEQ ID NO:49), Figure 112 (SEQ
ID NO:117), Figure 51 (SEQ ID NO:53), Figure 53 (SEQ ID NO:55), or Figure 55
(SEQ ID NO:57). Such variants include nucleic acids encoding variants of the
above-listed amino acid sequences, wherein those variants have several, such as 5 to
10, 1 to 5, or 3, 2 or 1 amino acids substituted, deleted, or added, in any combination.
Variants include polynucleotides encoding polypeptides with at least 95% but less
than 100% amino acid sequence identity to the polypeptides described herein by
amino acid sequence. Variant polynucleotides hybridize, under low to high
stringency conditions, to the alleles described herein by DNA sequence. In one
embodiment, variants have silent substitutions, additions and deletions that do not
alter the properties and activities of the FATP. Allelic variants of the polynucleotides
encoding hsFATP1 (Figure 45; SEQ ID NO:47), hsFATP2 (Figure 47; SEQ ID
NO:49), hsFATP3 (Figure 112; SEQ ID NO:117), hsFATP4 (Figure 51; SEQ ID
NO:53), hsFATP5 (Figure 53; SEQ ID NO:55) and hsFATP6 (Figure 55; SEQ ID
NO:57) will be identified as mapping to chromosomal locations listed for the
corresponding wild type genes in Table 2 in Example 1.

Orthologous genes are gene loci in different species that are sufficiently 
similar to each other in their nucleotide sequences to suggest that they originated 
10 from a common ancestral gene. Orthologous genes arise when a lineage splits into 
two species, rather than when a gene is duplicated within a genome. Proteins that are 
orthologs are encoded by genes of two different species, wherein the genes are said to 
be orthologous. 

The invention further relates to polynucleotides encoding polypeptides which
are orthologous to those polypeptides having a specific amino acid sequence
described herein, such as the amino acid sequences shown in Figure 45 (SEQ ID
NO:47), Figure 47 (SEQ ID NO:49), Figure 112 (SEQ ID NO:117), Figure 51 (SEQ
ID NO:53), Figure 53 (SEQ ID NO:55), or Figure 55 (SEQ ID NO:57). These
polynucleotides, which can be called ortholog polynucleotides, encode orthologous
polypeptides that can range in amino acid sequence identity to a reference amino acid
sequence described herein, from about 65% to less than 100%, but preferably 70% to
80%, more preferably 80% to 90%, and still more preferably 90% to less than 100%.
Orthologous polypeptides can also be those polypeptides that range in amino acid
sequence similarity to a reference amino acid sequence described herein from about
75% to 100%, within the signature sequence. The amino acid sequence similarity
between the signature sequences of orthologous polypeptides is preferably 80%, more
preferably 90%, and still more preferably, 95%. The ortholog polynucleotides encode
polypeptides that have similar functional characteristics (e.g., fatty acid transport
activity) and similar tissue distribution, as appropriate to the organism from which the
ortholog polynucleotides can be isolated.

Ortholog polynucleotides can be isolated from (e.g., by cloning or nucleic acid
amplification methods) a great number of species, as shown by the sample of FATPs
from evolutionarily divergent species described herein (see, e.g., Figures 44A-C
through Figure 89). Ortholog polynucleotides corresponding to those in Figure 45
(SEQ ID NO:47), Figure 47 (SEQ ID NO:49), Figures 111A-B (SEQ ID NO:116),
Figure 51 (SEQ ID NO:53), Figure 53 (SEQ ID NO:55) and Figure 55 (SEQ ID
NO:57) are those which can be isolated from mammals such as rat, dog, chimpanzee,
monkey, baboon, pig, rabbit and guinea pig, for example.

Further variants that are fragments of the nucleic acids of the invention may
be used to synthesize full-length nucleic acids of the invention, such as by use as
primers in a polymerase chain reaction. As used herein, the term primer refers to a
single-stranded oligonucleotide which acts as a point of initiation of
template-directed DNA synthesis under appropriate conditions (e.g., in the presence of four
different nucleoside triphosphates and an agent for polymerization, such as DNA or
RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable
temperature. The appropriate length of a primer depends on the intended use of the
primer, but typically ranges from 15 to 30 nucleotides. Short primer molecules
generally require cooler temperatures to form sufficiently stable hybrid complexes
with the template. A primer need not reflect the exact sequence of the template, but
must be sufficiently complementary to hybridize with a template. The term primer
site refers to the area of the target DNA to which a primer hybridizes. The term
primer pair refers to a set of primers including a 5' (upstream) primer that hybridizes
with the 5' end of the DNA sequence to be amplified and a 3' (downstream) primer
that hybridizes with the complement of the 3' end of the sequence to be amplified.
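
By way of illustration only, the relationship between primer length and annealing temperature can be sketched with the Wallace rule of thumb, which estimates the melting temperature of a short oligonucleotide as 2°C per A or T plus 4°C per G or C. The rule and the function names below are assumptions introduced for this sketch; they are not part of the methods described herein, and more accurate estimates would require nearest-neighbor thermodynamic models.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C: 2*(A+T) + 4*(G+C) (Wallace rule)."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def in_typical_length_range(primer: str, low: int = 15, high: int = 30) -> bool:
    """True if the primer length falls in the 15 to 30 nucleotide range given in the text."""
    return low <= len(primer) <= high
```

Under this approximation a shorter primer of similar composition yields a lower estimate, consistent with the statement that short primers generally require cooler temperatures.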
Further embodiments of the invention are nucleic acids that are at least 80%
identical over their entire length to a nucleic acid described herein, for example a
nucleic acid having the nucleotide sequence in Figures 44A-44C (SEQ ID NO:46),
Figures 46A-46B (SEQ ID NO:48), Figures 111A-B (SEQ ID NO:116), Figures 50A-50C
(SEQ ID NO:52), Figure 52 (SEQ ID NO:54), and Figures 54A-54C (SEQ ID
NO:56). Additional embodiments are nucleic acids, and the complements of such
nucleic acids, having at least 90% nucleotide sequence identity to the above-described
sequences, and nucleic acids having at least 95% nucleotide sequence
identity. In preferred embodiments, DNA of the present invention has 97%
nucleotide sequence identity, 98% nucleotide sequence identity, or at least 99%
nucleotide sequence identity with the DNA whose sequences are presented herein.

Other embodiments of the invention are nucleic acids that are at least 80%
identical in nucleotide sequence to a nucleic acid encoding a polypeptide having an
amino acid sequence as set forth in Figure 45 (SEQ ID NO:47), Figure 47 (SEQ ID
NO:49), Figure 112 (SEQ ID NO:117), Figure 51 (SEQ ID NO:53), Figure 53 (SEQ
ID NO:55) or Figure 55 (SEQ ID NO:57), or as such amino acid sequences are set
forth elsewhere herein, and nucleic acids that are complementary to such nucleic
acids. Specific embodiments are nucleic acids having at least 90% nucleotide
sequence identity to a nucleic acid encoding a polypeptide having an amino acid
sequence as described in the list above, nucleic acids having at least 95% sequence
identity, and nucleic acids having at least 97% sequence identity.

The terms "complementary" or "complementarity" as used herein, refer to the
natural binding of polynucleotides under permissive salt and temperature conditions
by base-pairing. Complementarity between two single-stranded molecules may be
"partial" in which only some of the nucleic acids bind, or it may be complete when
total complementarity exists between the single-stranded molecules (that is, when A-T
and G-C base pairing is 100% complete). The degree of complementarity between
nucleic acid strands has significant effects on the efficiency and strength of
hybridization between nucleic acid strands. This is of particular importance in
amplification reactions, which depend on binding between nucleic acid strands.
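
As a minimal sketch of the distinction between partial and complete complementarity, the following Python fragment pairs two equal-length DNA strands in antiparallel orientation and reports the fraction of positions forming A-T or G-C pairs. The function names are hypothetical, and the equal-length, ungapped comparison is a simplifying assumption made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the reverse complement of a DNA strand written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def fraction_complementary(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base-pair when strand_b is laid
    antiparallel to strand_a (both written 5' to 3', equal length)."""
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    a = strand_a.upper()
    b = strand_b.upper()[::-1]  # reverse so position i faces position i
    paired = sum(1 for x, y in zip(a, b) if COMPLEMENT.get(x) == y)
    return paired / len(a)
```

A value of 1.0 corresponds to complete complementarity as defined above; any lower value corresponds to partial complementarity.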
The invention further includes nucleic acids that hybridize to the above-described
nucleic acids, especially those nucleic acids that hybridize under stringent
hybridization conditions. "Stringent hybridization conditions" or "high stringency
conditions" generally occur within a range from about Tm minus 5°C (5°C below the
strand dissociation temperature or melting temperature (Tm) of the probe nucleic acid
molecule) to about 20°C to 25°C below Tm. As will be understood by those of skill
in the art, the stringency of hybridization may be altered in order to identify or detect
molecules having identical or related polynucleotide sequences. An example of high
stringency hybridization follows. Hybridization solution is (6x SSC/10 mM
EDTA/0.5% SDS/5x Denhardt's solution/100 µg/ml sheared and denatured salmon
sperm DNA). Hybridization is at 64-65°C for 16 hours. The hybridized blot is
washed two times with 2x SSC/0.5% SDS solution at room temperature for 15
minutes each, and two times with 0.2x SSC/0.5% SDS at 65°C, for one hour each.
Further examples of high stringency conditions can be found on pages 2.10.1-2.10.16
(see particularly 2.10.8-11) and pages 6.3.1-6 in Current Protocols in Molecular
Biology (Ausubel, F.M. et al., eds., containing supplements up through Supplement
42, 1998). Examples of high, medium, and low stringency conditions can be found
on pages 36 and 37 of WO 98/40404, which are incorporated herein by reference.
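
The temperature window given in the definition of high stringency conditions can be expressed as a simple calculation. The sketch below is illustrative only: it takes the probe Tm as an input supplied by the user, uses 25°C below Tm as the lower bound (the text gives "about 20°C to 25°C"), and the function names are assumptions introduced here.

```python
from typing import Tuple


def high_stringency_window(tm_celsius: float) -> Tuple[float, float]:
    """Return (lowest, highest) hybridization temperature in deg C,
    taken as Tm - 25 through Tm - 5."""
    return (tm_celsius - 25.0, tm_celsius - 5.0)


def is_high_stringency(temp_celsius: float, tm_celsius: float) -> bool:
    """True if the hybridization temperature lies inside the window."""
    low, high = high_stringency_window(tm_celsius)
    return low <= temp_celsius <= high
```

For a probe with a Tm of 85°C, for example, the window computed this way runs from 60°C to 80°C, so the 64-65°C hybridization of the example protocol would fall inside it.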

The invention further relates to nucleic acids obtainable by screening an
appropriate library with a probe having a nucleotide sequence such as that set forth in
Figures 44A-44C (SEQ ID NO:46), Figures 46A-46B (SEQ ID NO:48), Figure 111
(SEQ ID NO:116), Figures 50A-50C (SEQ ID NO:52), Figure 52 (SEQ ID NO:54) or
Figures 54A-54C (SEQ ID NO:56), or a probe which is a sufficiently long fragment
of any of the above; and isolating the nucleic acid. Such probes generally can
comprise at least 15 nucleotides. Nucleic acids obtainable by such screenings may
include RNAs, cDNAs and genomic DNA, for example, encoding FATPs of the
FATP family described herein.

Further uses for the nucleic acid molecules of the invention, whether encoding
a full-length FATP or whether comprising a contiguous portion of a nucleic acid
molecule such as one given in SEQ ID NO:46, 48, 116, 52, 54, or 56, include use as
markers for tissues in which the corresponding protein is preferentially expressed (to
identify constitutively expressed proteins or proteins produced at a particular stage of
tissue differentiation or stage of development of a disease state); as molecular weight
markers on Southern gels; as chromosome markers or tags (when labeled, for example
with biotin, a radioactive label or a fluorescent label) to identify chromosomes or to
map related gene positions; to compare with endogenous DNA sequences in a
mammal to identify potential genetic disorders; as probes to hybridize to, and thus
identify, related DNA sequences; as a source of information to derive PCR primers
for genetic fingerprinting; as a probe to "subtract-out" known sequences in the
process of discovering other novel nucleic acid molecules; for selecting and making
oligomers for attachment to a "gene chip" or other support, to be used, for example,
for examination of expression patterns; to raise anti-protein antibodies using DNA
immunization techniques; and as an antigen to raise anti-DNA antibodies or to elicit
another immune response.

In certain embodiments, a contiguous portion can be about 15, 25, 30, 40, 50, 
75, 100, 200, 300, 400, 500, 750, 1000, 1100, 1250, 1500 or more nucleotides in
length. In a particular embodiment, the contiguous portion encompasses the 
signature sequence of a FATP and is about 1080 nucleotides in length. 

Further methods to obtain nucleic acids encoding FATPs of the FATP family
include PCR and variations thereof (e.g., "RACE" PCR and semi-specific PCR
methods). Portions of the nucleic acids having a nucleotide sequence set forth in
Figures 44A-44C (SEQ ID NO:46), Figures 46A-46B (SEQ ID NO:48), Figures
111A-B (SEQ ID NO:116), Figures 50A-50C (SEQ ID NO:52), Figure 52 (SEQ ID
NO:54) or Figures 54A-54C (SEQ ID NO:56), (especially "flanking sequences" on
either side of a coding region) can be used as primers in methods using the
polymerase chain reaction, to produce DNA from an appropriate template nucleic
acid.

Once a fragment of the FATP gene is generated by PCR, it can be sequenced,
and the sequence of the product can be compared to other DNA sequences, for
example, by using the BLAST Network Service at the National Center for
Biotechnology Information. The boundaries of the open reading frame can then be
identified using semi-specific PCR or other suitable methods such as library
screening. Once the 5' initiator methionine codon and the 3' stop codon have been
identified, a PCR product encoding the full-length gene can be generated using
genomic DNA as a template, with primers complementary to the extreme 5' and 3'
ends of the gene or to their flanking sequences. The full-length genes can then be
cloned into expression vectors for the production of functional proteins.
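
Once sequence data are in hand, the initiator methionine codon and the in-frame stop codon that bound an open reading frame can be located computationally. The following is a minimal sketch under stated assumptions: it scans only the three forward reading frames of an uninterrupted (cDNA-like) sequence, uses the standard stop codons TAA, TAG and TGA, and the function name is hypothetical. It does not reproduce the semi-specific PCR or library screening methods referred to above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_longest_orf(dna: str):
    """Return (start, end) of the longest ATG...stop open reading frame on
    the given strand, or None. `end` is exclusive and includes the stop codon."""
    seq = dna.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame initiator codon
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None  # look for a later ORF in this frame
    return best
```

The returned coordinates could then guide the choice of primers complementary to the extreme 5' and 3' ends of the coding region.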

The invention also relates to isolated proteins or polypeptides such as those
encoded by nucleic acids of the present invention. Isolated proteins can be purified
from a natural source or can be made recombinantly. Proteins or polypeptides
referred to herein as "isolated" are proteins or polypeptides that exist in a state
different from the state in which they exist in cells in which they are normally
expressed in an organism, and include proteins or polypeptides obtained by methods
described herein, similar methods or other suitable methods, and also include
essentially pure proteins or polypeptides, proteins or polypeptides produced by
chemical synthesis or by combinations of biological and chemical methods, and
recombinant proteins or polypeptides which are isolated. Thus, the term "isolated" as
used herein, indicates that the polypeptide in question exists in a physical milieu
distinct from that in which it occurs in nature. Thus, "isolated" includes existing in
membrane fragments and vesicles, membrane fractions, liposomes, lipid bilayers and
other artificial membrane systems. An isolated FATP may be substantially isolated
with respect to the complex cellular milieu in which it naturally occurs, and may even
be purified essentially to homogeneity, for example as determined by PAGE or
column chromatography (for example, HPLC), but may also have further cofactors or
molecular stabilizers, such as detergents, added to the purified protein to enhance
activity. In one embodiment, proteins or polypeptides are isolated to a state at least
about 75% pure; more preferably at least about 85% pure, and still more preferably at
least about 95% pure, as determined by Coomassie blue staining of proteins on SDS-
polyacrylamide gels. Proteins or polypeptides referred to herein as "recombinant" are
proteins or polypeptides produced by the expression of recombinant nucleic acids.

In a preferred embodiment, an isolated polypeptide comprising a FATP, a
functional portion thereof, or a functional equivalent of the FATP, has at least one
function characteristic of a FATP, for example, transport activity, binding function
(e.g., a domain which binds to AMP), or antigenic function (e.g., binding of
antibodies that also bind to a naturally-occurring FATP, as that function is found in
an antigenic determinant). Functional equivalents can have activities that are
quantitatively similar to, greater than, or less than, the reference protein. These
proteins include, for example, naturally occurring FATPs that can be purified from
tissues in which they are produced (including polymorphic or allelic variants),
variants (e.g., mutants) of those proteins and/or portions thereof. Such variants
include mutants differing by the addition, deletion or substitution of one or more
amino acid residues, or modified polypeptides in which one or more residues are 
modified, and mutants comprising one or more modified residues. Portions or 
fragments of a FATP can range in size from four amino acid residues to the entire 
amino acid sequence minus one amino acid and include contiguous portions or 
fragments about 4, 5, 6, 7, 8, 9, 10, 15, 25, 30, 40, 50, 75, 100, 150, 200, 300, 400,
500, 600 or more amino acid residues in length. In one particular embodiment, the 
portion or fragment includes the signature sequence of a FATP polypeptide and is 
about 360 amino acid residues in length. 

The isolated proteins of the invention preferably include mammalian fatty
acid transport proteins of the FATP family of homologous proteins. In one
embodiment, the extent of amino acid sequence similarity between a polypeptide
having one of the amino acid sequences shown in Figure 45 (SEQ ID NO:47), Figure
47 (SEQ ID NO:49), Figure 112 (SEQ ID NO:117), Figure 51 (SEQ ID NO:53),
Figure 53 (SEQ ID NO:55), or Figure 55 (SEQ ID NO:57), and the respective
functional equivalents of these polypeptides is at least about 88%. In other
embodiments, the degree of amino acid sequence similarity between a FATP and its
respective functional equivalent is at least about 91%, at least about 94%, or at least
about 97%.

The polypeptides of the invention also include those FATPs encoded by 
polynucleotides which are orthologous to those polynucleotides, the sequences of
which are described herein in whole or in part. FATPs which are orthologs to those 
described herein by amino acid sequence, in whole or in part, are, for example, fatty 
acid transport proteins 1-6 of dog, rat, chimpanzee, monkey, rabbit, guinea pig, 
baboon and pig, and are also embodiments of the invention. 
To determine the percent identity or similarity of two amino acid sequences or
of two nucleic acid sequences, the sequences are aligned for optimal comparison
purposes (e.g., gaps can be introduced in one or both of a first and a second amino
acid or nucleic acid sequence for optimal alignment, and non-homologous
(dissimilar) sequences can be disregarded for comparison purposes). In a preferred
embodiment, the length of a reference sequence aligned for comparison purposes is at
least 30%, preferably at least 40%, more preferably at least 50%, even more
preferably at least 60%, and even more preferably at least 70%, 80%, or 90% of the
length of the reference sequence. The amino acid residues or nucleotides at
corresponding amino acid positions or nucleotide positions are then compared. When
a position in the first sequence is occupied by the same amino acid residue or
nucleotide as the corresponding position in the second sequence, then the molecules
are identical at that position (as used herein, amino acid or nucleic acid "identity" is
equivalent to amino acid or nucleic acid "similarity"). The percent identity between
the two sequences is a function of the number of identical positions shared by the
sequences, taking into account the number of gaps, and the length of each gap, which
need to be introduced for optimal alignment of the two sequences.
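
The position-by-position comparison described above can be sketched as follows for two sequences that have already been aligned, with gaps written as "-". This is an illustrative sketch only: the choice of the full alignment length (gap columns included) as the denominator is one common convention adopted here as an assumption, and the function name is hypothetical. Other programs normalize differently and may report different values for the same alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both rows.
    Gap columns count toward the length but never toward identity."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and equal in length")
    identical = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)
```

Because gap columns enlarge the denominator, each introduced gap, and each additional gap position, lowers the reported identity.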

The invention also encompasses polypeptides having a lower degree of
identity but having sufficient similarity so as to perform one or more of the same
functions performed by the polypeptides described herein by amino acid sequence.
Similarity for a polypeptide is determined by conserved amino acid substitution.
Such substitutions are those that substitute a given amino acid in a polypeptide by
another amino acid of like characteristics. Conservative substitutions are likely to be
phenotypically silent. Typically seen as conservative substitutions are the
replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and
Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues
Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the
basic residues Lys and Arg; and replacements among the aromatic residues Phe and
Tyr. Guidance concerning which amino acid changes are likely to be phenotypically
silent is found in Bowie et al., Science 247:1306-1310 (1990).

TABLE 1. Conservative Amino Acid Substitutions

Aromatic       Phenylalanine
               Tryptophan
               Tyrosine

Hydrophobic    Leucine
               Isoleucine
               Valine

Polar          Glutamine
               Asparagine

Basic          Arginine
               Lysine
               Histidine

Acidic         Aspartic Acid
               Glutamic Acid

Small          Alanine
               Serine
               Threonine
               Methionine
               Glycine
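
For illustration, the groupings of Table 1 can be encoded as a lookup so that a proposed substitution is classified as conservative when both residues fall within the same group. The one-letter codes, the dictionary name and the function name are assumptions introduced for this sketch; the groups themselves are those of Table 1.

```python
# Table 1 groups, written with standard one-letter amino acid codes.
TABLE_1_GROUPS = {
    "Aromatic": {"F", "W", "Y"},
    "Hydrophobic": {"L", "I", "V"},
    "Polar": {"Q", "N"},
    "Basic": {"R", "K", "H"},
    "Acidic": {"D", "E"},
    "Small": {"A", "S", "T", "M", "G"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues share a Table 1 group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in TABLE_1_GROUPS.values())
```

Residues that appear in no group of Table 1 (for example, cysteine or proline) are never reported as conservative by this sketch.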





The comparison of sequences and determination of percent identity and
similarity between two sequences can be accomplished using a mathematical
algorithm. (Computational Molecular Biology, Lesk, A.M., ed., Oxford University
Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith,
D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data,
Part 1, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994;
Sequence Analysis in Molecular Biology, von Heinje, G., Academic Press, 1987; and
Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M. Stockton
Press, New York, 1991). In a preferred embodiment, the percent identity between
two amino acid sequences is determined using the Needleman and Wunsch (J. Mol.
Biol. (48):444-453 (1970)) algorithm which has been incorporated into the GAP
program in the GCG software package (available at http://www.gcg.com), using
either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10,
8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred
embodiment, the percent identity between two nucleotide sequences is determined
using the GAP program in the GCG software package (Devereux, J., et al., Nucleic
Acids Res. 12(1):387 (1984)) (available at http://www.gcg.com), using a
NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length
weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the percent identity between two
amino acid or nucleotide sequences is determined using the algorithm of E. Meyers
and W. Miller (CABIOS, 4:11-17 (1989)) which has been incorporated into the
ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length
penalty of 12 and a gap penalty of 4.
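
The global alignment principle of Needleman and Wunsch can be sketched in a few lines. The following is a simplified illustration, not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM or PAM matrix and a single linear gap penalty in place of separate gap and length weights, and all parameter values and names are chosen here for illustration.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

The two aligned strings returned have equal length and can be compared column by column to obtain a percent identity.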

The nucleic acid and protein sequences of the present invention can further be
used as a "query sequence" to perform a search against databases to, for example,
identify other family members or related sequences. Such searches can be performed
using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (J. Mol.
Biol. 215:403-10 (1990)). BLAST nucleotide searches can be performed with the
NBLAST program, score = 100, word length = 12 to obtain nucleotide sequences
homologous to (with calculatably significant similarity to) the nucleic acid molecules
of the invention. BLAST protein searches can be performed with the XBLAST
program, score = 50, word length = 3 to obtain amino acid sequences homologous to
the proteins of the invention. To obtain gapped alignments for comparison purposes,
Gapped BLAST can be utilized as described in Altschul et al. (Nucleic Acids Res.
25(17):3389-3402 (1997)). When utilizing BLAST and gapped BLAST programs,
the default parameters of the respective programs (e.g., XBLAST and NBLAST) can
be used. See http://www.ncbi.nlm.nih.gov. Similarity for nucleotide and
amino acid sequences can be defined in terms of the parameters set by the Advanced
BLAST search available from NCBI (the National Center for Biotechnology
Information; see, for Advanced BLAST page,
www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-newblast?Jform=1). These default parameters, recommended for a
query molecule of length greater than 85 amino acid residues or nucleotides, have
been set as follows: gap existence cost, 11; per residue gap cost, 1; lambda ratio,
0.85. Further explanation of version 2.0 of BLAST can be found on related website
pages and in Altschul, S.F. et al., Nucleic Acids Res. 25:3389-3402 (1997).
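
As a small worked illustration of the default gap parameters recited above, the cost charged for a single gap can be computed from the existence cost and the per residue cost. The sketch assumes the affine convention in which a gap spanning k residues costs the existence cost plus k times the per residue cost; that convention, and the function name, are assumptions made here and should be checked against the documentation of the BLAST version actually used.

```python
def gap_cost(length: int, existence: int = 11, per_residue: int = 1) -> int:
    """Cost of one gap of `length` residues under an affine scheme:
    existence + per_residue * length (assumed convention)."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return existence + per_residue * length
```

Under these assumptions one gap of five residues costs 16, whereas five separate one-residue gaps cost 60, which is why affine scoring favors fewer, longer gaps.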

In certain embodiments, a contiguous portion can be about 4, 5, 6, 7, 8, 9, 10, 
15, 25, 30, 40, 50, 75, 100, 150, 200, 300, 400, 500, 600 or more amino acid residues 
in length. In one particular embodiment, the portion or fragment includes the 
signature sequence of a FATP polypeptide and is about 360 amino acid residues in
length. 

The invention further relates to fusion proteins, comprising a FATP or
functional portion thereof (as described above) as a first moiety, linked to a second
moiety not occurring in the FATP as found in nature. Thus, the second moiety can be
an amino acid, peptide or polypeptide. The first moiety can be in an N-terminal
location, C-terminal location or internal to the fusion protein. In one embodiment,
the fusion protein comprises a FATP as the first moiety, and a second moiety
comprising a linker sequence and an affinity ligand. Fusion proteins can be produced
by a variety of methods. For example, a fusion protein can be produced by the
insertion of a FATP gene or portion thereof into a suitable expression vector, such as
Bluescript SK +/- (Stratagene, La Jolla, CA), pGEX-4T-2 (Pharmacia, Peapack, NJ),
pET-24(+) (Novagen, Madison, WI), or vectors of similar construction. The resulting
construct can be introduced into a suitable host cell for expression. Upon expression,
fusion protein can be purified from cells by means of a suitable affinity matrix (see,
e.g., Current Protocols in Molecular Biology, Ausubel, F.M. et al., eds., Vol. 2, pp.
16.4.1-16.7.8, containing supplements up through Supplement 42, 1998).

The invention also relates to enzymatically produced, synthetically produced,
or recombinantly produced portions of a fatty acid transport protein. Portions of a
FATP can be made which have full or partial function on their own, or which when
mixed together (though fully, partially, or nonfunctional alone), spontaneously
assemble with one or more other polypeptides to reconstitute a functional protein
having at least one function characteristic of a FATP.

Fragments of a FATP can be produced by direct peptide synthesis, for
example those using solid-phase techniques (Roberge, J.Y. et al., Science 269:202-204
(1995); Merrifield, J., J. Am. Chem. Soc. 85:2149-2154 (1963)). Protein
synthesis can be performed using manual techniques or by automation. Automated
synthesis can be carried out using, for instance, an Applied Biosystems 431A Peptide
Synthesizer (Perkin Elmer). Various fragments of a FATP can be synthesized
separately and combined using chemical methods.

One aspect of the invention is a peptide or polypeptide having the amino acid
sequence of a portion of a fatty acid transport protein which is hydrophilic rather than
hydrophobic, and ordinarily can be detected as facing the outside of the cell
membrane. Such a peptide or polypeptide can be thought of as being an extracellular
domain of the FATP, or a mimetic of said extracellular domain. It is known, for
example, that a portion of human FATP4 that includes a highly conserved motif is
involved in AMP-CoA binding function (Stuhlsatz-Krouper, S.M. et al., J. Biol.
Chem. 44:28642-28650 (1998)).

The term "mimetic" as used herein, refers to a molecule, the structure of 
which is developed from knowledge of the structure of the FATP of interest, or one 

or more portions thereof, and, as such, is able to effect some or all of the functions of
a FATP. 

Portions of a FATP can be prepared by enzymatic cleavage of the isolated
protein, or can be made by chemical synthesis methods. Portions of a FATP can also
be made by recombinant DNA methods in which restriction fragments, or fragments
that may have undergone further enzymatic processing, or synthetically made DNAs
are joined together to construct an altered FATP gene. The gene can be made such
that it encodes one or more desired portions of a FATP. These portions of FATP can
be entirely homologous to a known FATP, or can be altered in amino acid sequence
relative to naturally occurring FATPs to enhance or introduce desired properties such
as solubility, stability, or affinity to a ligand. A further feature of the gene can be a
sequence encoding an N-terminal signal peptide directed to the plasma membrane.

An extracellular domain can be determined by a hydrophobicity plot, such as
those shown in Figures 28A, 29A, and 35A, or by a hydrophilicity plot such as those
shown in Figures 28C, 29C, 35C, 91, 92 and 93. A polypeptide or peptide
comprising all or a portion of a FATP extracellular domain can be used in a
pharmaceutical composition. When administered to a mammal by an appropriate
route, the polypeptide or peptide can bind to fatty acids and compete with the native
FATPs in the membrane of cells, thereby making fewer fatty acid molecules available
as substrates for transport into cells, and reducing the amount of fatty acids taken up
by, for example, the heart, in the case of FATP6.
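
A hydrophobicity plot of the kind referred to above is commonly generated by averaging a per-residue hydropathy value over a sliding window. The sketch below assumes the Kyte-Doolittle hydropathy scale and a default window of nine residues; neither the scale nor the window used for the figures is stated in this passage, so both are assumptions, as is the function name.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9):
    """Return the mean hydropathy of each full window along the sequence."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]
```

Stretches with strongly negative window averages are candidates for hydrophilic, potentially exposed regions, whereas sustained positive stretches suggest membrane-embedded segments; any such assignment would need experimental confirmation.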

Another aspect of the invention relates to a method of producing a fatty acid
transport protein, variants or portions thereof, and to expression systems and host
cells containing a vector appropriate for expression of a fatty acid transport protein.

Cells that express a FATP, a variant or a portion thereof, or an ortholog of a
FATP described herein by amino acid sequence, can be made and maintained in
culture, under conditions suitable for expression, to produce protein in the cells for
cell-based assays, or to produce protein for isolation. These cells can be procaryotic
or eucaryotic. Examples of procaryotic cells that can be used for expression include
Escherichia coli, Bacillus subtilis and other bacteria. Examples of eucaryotic cells
that can be used for expression include yeasts such as Saccharomyces cerevisiae,
Schizosaccharomyces pombe, Pichia pastoris and other lower eucaryotic cells, and
cells of higher eucaryotes such as those from insects and mammals, such as primary
cells and cell lines such as CHO, HeLa, 3T3 and BHK cells, preferably COS cells and
human kidney 293 cells, and more preferably Jurkat cells. (See, e.g., Ausubel, F.M.
et al., eds. Current Protocols in Molecular Biology, Greene Publishing Associates
and John Wiley & Sons, Inc., containing Supplements up through Supplement 42,
1998.)

In one embodiment, host cells that produce a recombinant FATP, or a portion
thereof, a variant, or an ortholog of a FATP described herein by amino acid sequence,
can be made as follows. A gene encoding a FATP, variant or a portion thereof can be
inserted into a nucleic acid vector, e.g., a DNA vector, such as a plasmid, phage,
cosmid, phagemid, virus, virus-derived vector (e.g., SV40, vaccinia, adenovirus, fowl
pox virus, pseudorabies viruses, retroviruses) or other suitable replicon, which can be
present in a single copy or multiple copies, or the gene can be integrated in a host cell
chromosome. A suitable replicon or integrated gene can contain all or part of the
coding sequence for a FATP or variant, operably linked to one or more expression
control regions whereby the coding sequence is under the control of transcription
signals and linked to appropriate translation signals to permit translation. The vector
can be introduced into cells by a method appropriate to the type of host cells (e.g.,
transfection, electroporation, infection). For expression from the FATP gene, the
host cells can be maintained under appropriate conditions (e.g., in the presence of
inducer, normal growth conditions, etc.). Proteins or polypeptides thus produced can
be recovered (e.g., from the cells, as in a membrane fraction, from the periplasmic
space of bacteria, from culture medium) using suitable techniques. Appropriate
membrane targeting signals may be incorporated into the expressed polypeptide.
These signals may be endogenous to the polypeptide or they may be heterologous
signals.

Polypeptides of the invention can be recovered and purified from cell cultures
(or from their primary cell source) by well-known methods including ammonium
sulfate or ethanol precipitation, acid extraction, anion or cation exchange
chromatography, phosphocellulose chromatography, hydrophobic interaction
chromatography, affinity chromatography, hydroxylapatite chromatography and high
performance liquid chromatography. Known methods for refolding protein can be
used to regenerate active conformation if the polypeptide is denatured during
isolation or purification.

In a further aspect of the invention are methods for assessing the transport
function of any of the fatty acid transport proteins or polypeptides described herein,
including orthologs, and in variations of these, methods for identifying an inhibitor
(or an enhancer) of such function and methods for assessing the transport function in
the presence of a candidate inhibitor or a known inhibitor.

A variety of systems comprising living cells can be used for these methods.
Cells to be used in fatty acid transport assays, and further in methods for identifying
an inhibitor or enhancer of this function, express one or more FATPs. See Examples
3, 6, 9, 12 and 14 for data on tissue distribution of expression of FATPs, and
Examples 10 and 11 describing recombinant cells expressing FATP. Cells for use in
cell-based assays described herein can be drawn from a variety of sources, such as
isolated primary cells of various organs and tissues wherein one or more FATPs are
naturally expressed. In some cases, the cells can be from adult organs, and in some
cases, from embryonic or fetal organs, such as heart, lung, liver, intestine, skeletal
muscle, kidney and the like. Cells for this purpose can also include cells cultured as
fragments of organs or in conditions simulating the cell type and/or tissue
organization of organs, in which artificial materials may be used as substrates for cell
growth. Other types of cells suitable for this purpose include cells of a cell strain or
cell line (ordinarily comprising cells considered to be "transformed") transfected to
express one or more FATPs.

A further embodiment of the invention is a method for detecting, in a sample
of cells, a fatty acid transport protein, a portion or fragment thereof, a fusion protein
comprising a FATP or a portion thereof, or an ortholog as described herein, wherein
the cells can be, for instance, cells of a tissue, primary culture cells, or cells of a cell
line, including cells into which nucleic acid has been introduced. The method
comprises adding to the sample an agent that specifically binds to the protein, and
detecting the agent specifically bound to the protein. Appropriate washing steps can
be added to reduce nonspecific binding of the agent. The agent can be, for example,
an antibody, a ligand or a substrate mimic. The agent can have incorporated into it,
or have bound to it, covalently or by high affinity non-covalent interactions, for
instance, a label that facilitates detection of the agent to which it is bound, wherein
the label can be, but is not limited to, a phosphorescent label, a fluorescent label, a
biotin or avidin label, or a radioactive label. The means of detection of a fatty acid
transport protein can vary, as appropriate to the agent and label used. For example,
for an antibody that binds to the fatty acid transport protein, the means of detection
may call for binding a second antibody, which has been conjugated to an enzyme, to
the antibody which binds the fatty acid transport protein, and detecting the presence
of the second antibody by means of the enzymatic activity of the conjugated enzyme.

Similar principles can also be applied to a cell lysate or a more purified
preparation of proteins from cells that may comprise a fatty acid transport protein of
interest, for example in immunoprecipitation, immunoblotting, and
immunoaffinity methods that, in addition to detection of the particular FATP, can
also be used in purification steps, and in qualitative and quantitative immunoassays.
See, for instance, chapters 11 through 14 in Antibodies: A Laboratory Manual, E.
Harlow and D. Lane, eds., Cold Spring Harbor Laboratory, 1988.

Isolated fatty acid transport protein, or an antigenically similar portion thereof,
especially a portion that is soluble, can be used in a method to select and identify
molecules which bind specifically to the FATP. Fusion proteins comprising all of, or
a portion of, the fatty acid transport protein linked to a second moiety not occurring in
the FATP as found in nature, can be prepared for use in another embodiment of the
method. Suitable fusion proteins for this purpose include those in which the second
moiety comprises an affinity ligand (e.g., an enzyme, antigen, epitope). FATP fusion
proteins can be produced by the insertion of a gene encoding the FATP or a variant
thereof, or a suitable portion of such gene, into a suitable expression vector which
encodes an affinity ligand (e.g., pGEX-4T-2 and pET-15b, encoding
glutathione S-transferase and His-Tag affinity ligands, respectively). The expression
vector can be introduced into a suitable host cell for expression. Host cells are lysed
and the lysate, containing fusion protein, can be bound to a suitable affinity matrix by
contacting the lysate with an affinity matrix. In a particular embodiment, a nucleic acid
encodes a portion of a FATP polypeptide which includes a motif or domain, for
example, a lipocalin domain or an AMP-binding domain. Such a polypeptide portion
can be a functional portion of a FATP protein. The term "lipocalin domain" is an
art-recognized term and as used herein refers to a particular domain present in FATP
proteins. This domain is described as including regions of sequence homology as
well as a common tertiary structure represented as an eight stranded antiparallel
beta-barrel (see Banaszak, L. et al., Advances in Protein Chemistry, 45: 89-151). Many
lipocalin domains can be identified structurally as a sequence contained within the
general formula:
[DENG]-X-[DENQGSTARK]-X(0,2)-[DENQARK]-[LIVFY]-{CP}-G-{C}-W-[FYWLRH]-X-[LIVMTA],
e.g., the lipocalin signature sequence or
consensus pattern (SEQ ID NO: 125). One skilled in the art will recognize that a
lipocalin domain for a particular FATP protein can vary in sequence from this general
formula. A FATP lipocalin domain can be, for example, identical to the lipocalin
signature sequence, or can exhibit 60, 65, 70, 75, 80, 85, 90, 95 or greater
percent sequence identity in comparison to the general formula, provided that it still
retains the necessary lipocalin binding function.
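
The general formula above is written in a PROSITE-style pattern notation, so it can be translated into a regular expression for scanning a protein sequence. The following is a minimal sketch, assuming that X denotes any residue, that braces exclude the listed residues, and that the final positions read [FYWLRH]-X-[LIVMTA]. The function name and the example peptide are hypothetical and are not taken from any FATP sequence.

```python
import re

# PROSITE-style lipocalin signature translated to a regular expression.
# X -> any residue (.), X(0,2) -> .{0,2}, {CP} -> any residue except C or P.
LIPOCALIN_SIGNATURE = re.compile(
    r"[DENG].[DENQGSTARK].{0,2}[DENQARK][LIVFY][^CP]G[^C]W[FYWLRH].[LIVMTA]"
)


def find_lipocalin_signatures(sequence):
    """Return (start, matched_text) for each non-overlapping signature match."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in LIPOCALIN_SIGNATURE.finditer(seq)]


# Made-up peptide used only to exercise the pattern; not a real FATP sequence.
example = "MKTDADDLAGAWFALQQ"
print(find_lipocalin_signatures(example))
```

An exact pattern match corresponds to the "identical to the signature" case only; domains that merely share a percentage of identity with the formula would need an alignment-based comparison instead.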

For example, a lipocalin domain for each of the human FATPs, hsFATP1
(SEQ ID NO: 126), hsFATP2 (SEQ ID NO: 127), hsFATP3 (SEQ ID NO: 128),
hsFATP4 (SEQ ID NO: 129), hsFATP5 (SEQ ID NO: 130), and hsFATP6 (SEQ ID
NO: 131) has been identified. These particular lipocalin domains are located near the
N-terminal portion of the specified proteins (see Figure 118). The sequences of these
lipocalin domains are highly conserved across the FATP family. A search using the
lipocalin signature sequence conducted on a public database
(www.ebi.ac.uk/interpro/) indicated that the lipocalin domains of hsFATP1 and
hsFATP4 share identity with the signature sequence. In addition, a search directed to
identifying sequences having at least 80% identity to the lipocalin signature sequence
identified three additional human FATPs, hsFATP3, hsFATP5 and hsFATP6.

A lipocalin domain can also be identified functionally since, for example, it
has been identified as a binding motif capable of binding fatty acids. In particular,
the studies described in Experiment 20 demonstrated that fusion proteins including
the lipocalin domains from hsFATP4 bound long chain fatty acids such as oleates and
palmitates with great specificity. Other fatty acids can also be used to assess binding
in FATP4 and other members of the FATP family.

Polypeptides, including fusion polypeptides, which contain a lipocalin domain
can also include additional components. For example, fusion polypeptides containing
a lipocalin domain can include amino acid residues from the portion of the protein
which is located upstream, i.e., in the direction of the N-terminal end of a FATP
protein, from the lipocalin domain. As the term "upstream sequences" is used herein
in relation to the lipocalin domain, it is intended to refer to the amino acid residues of
a FATP protein which are located between the signal peptide (when one is present)
and the lipocalin domain. In the absence of a signal peptide, the term refers to the
portion of a FATP protein between the lipocalin domain and the amino terminus (see
Figure 115).

Fusion polypeptides which contain a lipocalin domain can also include
additional domains or motifs; for example, an AMP binding domain can be included.
An AMP binding domain for each of the human FATPs, hsFATP1 (SEQ
ID NO: 132), hsFATP2 (SEQ ID NO: 133), hsFATP3 (SEQ ID NO: 134), hsFATP4
(SEQ ID NO: 135), hsFATP5 (SEQ ID NO: 136) and hsFATP6 (SEQ ID NO: 137)
has been identified (see Figure 118).

In one embodiment, the fusion protein can be immobilized on a suitable
affinity matrix under conditions sufficient to bind the affinity ligand portion of the
fusion protein to the matrix, and is contacted with one or more candidate binding
agents (e.g., a mixture of peptides) to be tested, under conditions suitable for binding
of the binding agents to the FATP portion of the bound fusion protein. Next, the
affinity matrix with bound fusion protein can be washed with a suitable wash buffer
to remove unbound candidate binding agents and non-specifically bound candidate
binding agents. Those agents which remain bound can be released by contacting the
affinity matrix with fusion protein bound thereto with a suitable elution buffer. Wash
buffer can be formulated to permit binding of the fusion protein to the affinity matrix,
without significantly disrupting binding of specifically bound binding agents. In this
aspect, elution buffer can be formulated to permit retention of the fusion protein by
the affinity matrix, but can be formulated to interfere with binding of the candidate
binding agents to the target portion of the fusion protein. For example, a change in
the ionic strength or pH of the elution buffer can lead to release of specifically bound
agent, or the elution buffer can comprise a release component or components
designed to disrupt binding of specifically bound agent to the target portion of the
fusion protein.

Immobilization can be performed prior to, simultaneous with, or after,
contacting the fusion protein with candidate binding agent, as appropriate. Various
permutations of the method are possible, depending upon factors such as the
candidate molecules tested, the affinity matrix-ligand pair selected, and elution buffer
formulation. For example, after the wash step, fusion protein with binding agent
molecules bound thereto can be eluted from the affinity matrix with a suitable elution
buffer (a matrix elution buffer, such as glutathione for a GST fusion). Where the
fusion protein comprises a cleavable linker, such as a thrombin cleavage site,
cleavage from the affinity ligand can release a portion of the fusion with the candidate
agent bound thereto. Bound agent molecules can then be released from the fusion
protein or its cleavage product by an appropriate method, such as extraction.

One or more candidate binding agents can be tested simultaneously. Where a
mixture of candidate binding agents is tested, those found to bind by the foregoing
processes can be separated (as appropriate) and identified by suitable methods (e.g.,
PCR, sequencing, chromatography). Large libraries of candidate binding agents (e.g.,
peptides, RNA oligonucleotides) produced by combinatorial chemical synthesis or by
other methods can be tested (see e.g., Ohlmeyer, M.H.J. et al., Proc. Natl. Acad. Sci.
USA 90:10922-10926 (1993) and DeWitt, S.H. et al., Proc. Natl. Acad. Sci. USA
90:6909-6913 (1993), relating to tagged compounds; see also Rutter, W.J. et al., U.S.
Patent No. 5,010,175; Huebner, V.D. et al., U.S. Patent No. 5,182,366; and Geysen,
H.M., U.S. Patent No. 4,833,092). Random sequence RNA libraries (see Ellington,
A.D. et al., Nature 346:818-822 (1990); Bock, L.C. et al., Nature 355:564-566
(1992); and Szostak, J.W., Trends in Biochem. Sci. 17:89-93 (March, 1992)) can also
be screened according to the present method to select RNA molecules which bind to a
target FATP or FATP fusion protein. Where binding agents selected from a
combinatorial library by the present method carry unique tags, identification of
individual biomolecules by chromatographic methods is possible. Where binding
agents do not carry tags, chromatographic separation, followed by mass spectrometry
to ascertain structure, can be used to identify binding agents selected by the method,
for example.

The invention also comprises a method for identifying an agent which inhibits
interaction between a fatty acid transport protein (e.g., one comprising the amino acid
sequence in SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:117, SEQ ID NO:53, SEQ
ID NO:55, or SEQ ID NO:57), and a ligand of said protein. The FATP can be one
described by an amino acid sequence herein, a portion or fragment thereof, a variant
thereof, or an ortholog thereof, or a FATP fusion protein. Here, a ligand can be, for
instance, a substrate, or a substrate mimic, an antibody, or a compound, such as a
peptide, that binds with specificity to a site on the protein. The method comprises
combining, not limited to a particular order, the fatty acid transport protein, the ligand
of the protein, and a candidate agent to be assessed for its ability to inhibit interaction
between the protein and the ligand, under conditions appropriate for interaction
between the protein and the ligand (e.g., pH, salt, temperature conditions conducive
to appropriate conformation and molecular interactions); determining the extent to
which the protein and ligand interact; and comparing (1) the extent of protein-ligand
interaction in the presence of candidate agent with (2) the extent of protein-ligand
interaction in the absence of candidate agent, wherein if (1) is less than (2), then the
candidate agent is one which inhibits interaction between the protein and the ligand.
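
The comparison of (1) and (2) can be expressed as a simple calculation on the measured binding signals. The sketch below is illustrative only: the function names, the optional background subtraction, and the use of a percentage are assumptions and are not part of the described method.

```python
def percent_inhibition(bound_with_agent, bound_without_agent, background=0.0):
    """Percent reduction in bound-ligand signal relative to the no-agent control."""
    test = bound_with_agent - background
    control = bound_without_agent - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (1.0 - test / control)


def inhibits_interaction(bound_with_agent, bound_without_agent, background=0.0):
    """True when binding with the agent (1) is less than binding without it (2)."""
    return percent_inhibition(bound_with_agent, bound_without_agent, background) > 0.0


# Hypothetical readings in arbitrary signal units.
print(percent_inhibition(250.0, 1000.0))
print(inhibits_interaction(250.0, 1000.0))
```

In practice a margin above zero, chosen from replicate variability, would usually be required before calling an agent an inhibitor.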

The method can be facilitated, for example, by using an experimental system
which employs a solid support (column chromatography matrix, wall of a plate,
microtiter wells, column pore glass, pins to be submerged in a solution, beads, etc.)
to which the protein can be attached. Accordingly, in one embodiment, the protein
can be fixed to a solid phase directly or indirectly, by a linker. The candidate agent to
be tested is added under conditions conducive for interaction and binding to the
protein. The ligand is added to the solid phase system under conditions appropriate
for binding. Excess ligand is removed, as by a series of washes done under
conditions that do not disrupt protein-ligand interactions. Detection of bound ligand
can be facilitated by using a ligand that carries a label (e.g., fluorescent,
chemiluminescent, radioactive). In a control experiment, protein and ligand are
allowed to interact in the absence of any candidate agent, under conditions otherwise
identical to those used for the "test" conditions where candidate inhibiting agent is
present, and any washes used in the test conditions are also used in the control. The
extent to which ligand binds to the protein in the presence of candidate agent is
compared to the extent to which ligand binds to the protein in the absence of the
candidate agent. If the extent to which interaction of the protein and the ligand
occurs is less in the presence of the candidate agent than in the absence of the
candidate agent, the candidate agent is an agent which inhibits interaction between
the protein and the ligand of the protein.

In a further embodiment, an inhibitor (or an enhancer) of a fatty acid transport
protein can be identified. The method comprises steps which are, or are variations of
the following: contacting the cells with fatty acid, wherein the fatty acid can be
labeled for convenience of detection; contacting a first aliquot of the cells with an
agent being tested as an inhibitor (or enhancer) of fatty acid uptake while maintaining
a second aliquot of cells under the same conditions but without contact with the
agent; and measuring (e.g., quantitating) fatty acid in the first and second aliquots of
cells; wherein a lesser quantity of fatty acid in the first aliquot compared to that in
the second aliquot is indicative that the agent is an inhibitor of fatty acid uptake by a
fatty acid transport protein. A greater quantity of fatty acid in the first aliquot
compared to that in the second aliquot is indicative that the agent is an enhancer of
fatty acid uptake by a fatty acid transport protein.
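
The decision rule for the first (test) and second (control) aliquots can be sketched as follows. This is a minimal illustration, not the method itself: the function name, the use of replicate means, and the `min_relative_change` cutoff are all assumptions introduced here to avoid calling small differences an effect.

```python
from statistics import mean


def classify_agent(test_uptake, control_uptake, min_relative_change=0.2):
    """Classify an agent from replicate fatty acid readings of two aliquots.

    test_uptake: readings from the first aliquot (contacted with the agent).
    control_uptake: readings from the second aliquot (no agent).
    min_relative_change: hypothetical cutoff, as a fraction of control.
    """
    ratio = mean(test_uptake) / mean(control_uptake)
    if ratio <= 1.0 - min_relative_change:
        return "inhibitor"
    if ratio >= 1.0 + min_relative_change:
        return "enhancer"
    return "no clear effect"


# Hypothetical readings (e.g., counts per minute of labeled fatty acid).
print(classify_agent([400, 420, 380], [1000, 980, 1020]))
```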

A particular embodiment of identifying an inhibitor or enhancer of fatty acid
transport function employs the above steps, but also employs additional steps
preceding those given above: introducing into cells of a cell strain or cell line ("host
cells" for the intended introduction of, or after the introduction of, a vector) a vector
comprising a fatty acid transport protein gene, wherein expression of the gene can be
regulatable or constitutive, and providing conditions to the host cells under which
expression of the gene can occur.

The terms "contacting" and "combining", as used herein, refer to
bringing molecules into close proximity to each other, which can be accomplished by
conventional means. For example, when referring to molecules that are soluble,
contacting is achieved by adding the molecules together in a solution. "Contacting"
can also be adding an agent to a test system, such as a vessel containing cells in tissue
culture.

The term "inhibitor" or "antagonist", as used herein, refers to an agent which
blocks, diminishes, inhibits, hinders, limits, decreases, reduces, restricts or interferes
with fatty acid transport into the cytoplasm of a cell, or alternatively or additionally,
prevents or impedes the cellular effects associated with fatty acid transport. The term
"enhancer" or "agonist", as used herein, refers to an agent which augments, enhances,
or increases fatty acid transport into the cytoplasm of a cell. An antagonist will
decrease fatty acid concentration, fatty acid metabolism and byproduct levels in the
cell, leading to phenotypic and molecular changes.

In order to produce a "host cell" type suitable for fatty acid uptake assays and
for assays derived therefrom for identifying inhibitors or enhancers thereof, a nucleic
acid vector can be constructed to comprise a gene encoding a fatty acid transport
protein, for example, human FATP1, FATP2, FATP3, FATP4, FATP5, FATP6, a
mutant or variant thereof, an ortholog of the human proteins, such as mouse orthologs
or orthologs found in other mammals, or a FATP family protein of origin in an
organism other than a mammal. The gene of the vector can be regulatable, such as by
the placement of the gene under the control of an inducible or repressible promoter in
the vector (e.g., inducible or repressible by a change in growth conditions of the host
cell harboring the vector, such as addition of inducer, binding or functional removal
of repressor from the cell milieu, or change in temperature) such that expression of
the FATP gene can be turned on or initiated by causing a change in growth
conditions, thereby causing the protein encoded by the gene to be produced, in host
cells comprising the vector, as a plasma membrane protein. Alternatively, the FATP
gene can be constitutively expressed.

A vector comprising a FATP gene, such as a vector described herein, can be
introduced into host cells by a means appropriate to the vector and to the host cell
type. For example, commonly used methods such as electroporation, transfection,
for instance, transfection using CaCl2, and transduction (as for a virus or
bacteriophage) can be used. Host cells can be, for example, mammalian cells such as
primary culture cells or cells of cell lines such as COS cells, 293 cells or Jurkat cells.
Host cells can also be, in some cases, cells derived from insects, cells of insect cell
lines, bacterial cells, such as E. coli, or yeast cells, such as S. cerevisiae. It is
preferred that the fatty acid transport protein whose function is to be assessed, with or
without a candidate inhibitor or enhancer, be produced in host cells whose ancestor
cells originated in a species related to the species of origin of the FATP gene
encoding the fatty acid transport protein. For example, it is preferable that tests of
function or of inhibition or enhancement of a mammalian FATP be carried out in host
mammalian cells producing the FATP, rather than bacterial cells or yeast cells.

Host cells comprising a vector comprising a regulatable FATP gene can be
treated so as to allow expression of the FATP gene and production of the encoded
protein (e.g., by contacting the cells with an inducer compound that effects
transcription from an inducible promoter operably linked to the FATP gene).

Alternatively, host cells containing an endogenous FATP gene can be
engineered to activate or deactivate expression of the FATP gene and production of
the encoded protein. For example, homologous recombination, often referred to as
targeting, can be utilized to alter the regulatory region associated with the FATP gene
to increase or decrease the level of expression. Alteration of the regulatory region
can include disablement of the regulatory region associated with the FATP gene
and/or replacement of the region or a portion of the region. A variety of regulatory
regions are known which can be transfected into cells to cause an endogenous gene to
display a pattern of induction or expression that differs from that of the cell prior to
transfection.

The test agent (e.g., an agonist or antagonist) is added to the cells to be used
in a fatty acid transport assay, in the presence or absence of test agent, under
conditions suitable for production and/or maintenance of the expressed FATP in a
conformation appropriate for association of the FATP with test agent and substrate.
For example, conditions under which an agent is assessed, such as media and
temperature requirements, can, initially, be similar to those necessary for transport of
typical fatty acid substrates across the plasma membrane. One of ordinary skill in the
art will know how to vary experimental conditions depending upon the biochemical
nature of the test agent. The test agent can be added to the cells in the presence of
fatty acid, or in the absence of fatty acid substrate, with the fatty acid substrate being
added following the addition of the test agent. The concentration at which the test
agent can be evaluated can be varied, as appropriate, to test for an increased effect
with increasing concentrations.
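
Testing for an increased effect with increasing concentration amounts to checking a dose-response trend. The sketch below assumes a candidate inhibitor, so that uptake should fall as concentration rises (for an enhancer the comparison would be reversed). The function names, units, and readings are hypothetical.

```python
def percent_of_control(uptake_by_concentration, control_uptake):
    """Express uptake at each agent concentration as a percentage of control.

    uptake_by_concentration: dict mapping concentration -> measured uptake.
    Returns a dict ordered by ascending concentration.
    """
    return {
        conc: 100.0 * uptake / control_uptake
        for conc, uptake in sorted(uptake_by_concentration.items())
    }


def inhibition_increases_with_dose(uptake_by_concentration, control_uptake):
    """True if uptake never rises as concentration increases and ends lower."""
    values = list(percent_of_control(uptake_by_concentration, control_uptake).values())
    non_increasing = all(a >= b for a, b in zip(values, values[1:]))
    return non_increasing and values[0] > values[-1]


# Hypothetical readings: concentration (arbitrary units) -> uptake signal.
readings = {1.0: 900.0, 10.0: 600.0, 100.0: 200.0}
print(percent_of_control(readings, 1000.0))
print(inhibition_increases_with_dose(readings, 1000.0))
```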

Test agents to be assessed for their effects on fatty acid transport can be any
chemical (element, molecule, compound), made synthetically, made by recombinant
techniques or isolated from a natural source. For example, test agents can be
peptides, polypeptides, peptoids, sugars, hormones, or nucleic acid molecules, such as
antisense nucleic acid molecules. In addition, test agents can be small molecules or
molecules of greater complexity made by combinatorial chemistry, for example, and
compiled into libraries. These libraries can comprise, for example, alcohols, alkyl
halides, amines, amides, esters, aldehydes, ethers and other classes of organic
compounds. Test agents can also be natural or genetically engineered products
isolated from lysates of cells, bacterial, animal or plant, or can be the cell lysates
themselves. Presentation of test compounds to the test system can be in either an
isolated form or as mixtures of compounds, especially in initial screening steps.

Thus, the invention relates to a method for identifying agents which alter fatty
acid transport, the method comprising providing the test agent to the cell (wherein
"cell" includes the plural, and can include cells of a cell strain, cell line or culture of
primary cells or organ culture, for example), under conditions suitable for binding to
its target, whether to the FATP itself or to another target on or in the cell, wherein the
transformed cell comprises a FATP.

In greater detail, to test one or more agents or compounds (e.g., a mixture of
compounds can conveniently be screened initially) for inhibition of the transport
function of a fatty acid transport protein, the agent(s) can be contacted with the cells.
The cells can be contacted with a labeled fatty acid. The fatty acid can be, for
example, a known substrate of the fatty acid transport protein such as oleate or
palmitate. The fatty acid can itself be labeled with a radioactive isotope (e.g., 3H or
14C) or can have a radioactively labeled adduct attached. In other variations, the fatty
acid can have chemically attached to it a fluorescent label, or a substrate for an
enzyme occurring within the cells, wherein the substrate yields a detectable product,
such as a highly colored or fluorescent product. Addition of candidate inhibitors and
labeled substrate to the cells comprising fatty acid transport protein can be in either
order or can be simultaneous.

A second aliquot of cells, which can be called "control" cells (a "first" aliquot
of cells can be called "test" cells), is treated, if necessary (as in the case of
transformed "host" cells), so as to allow expression of the FATP gene, and is
contacted with the labeled substrate of the fatty acid transport protein. The second
aliquot of cells is not contacted with one or more agents to be tested for inhibition of
the transport function of the protein produced in the cells, but is otherwise kept under
the same culture conditions as the first aliquot of cells.

In a further step of a method to identify inhibitors of a fatty acid transport
protein, the labeled fatty acid is measured in the first and second aliquots of cells. A
preliminary step of this measurement process can be to separate the external medium
from the cells so as to be able to distinguish the labeled fatty acid external to the cells
from that which has been transported inside the cells. This can be accomplished, for
instance, by removing the cells from their growth container, centrifuging the cell
suspension, removing the supernatant and performing one or more wash steps to
extensively dilute the remaining medium which may contain labeled fatty acid.
Detection of the labeled fatty acid can be by a means appropriate to the label used.
For example, for a radioactive label, detection can be by scintillation counting of
appropriately prepared samples of cells (e.g., lysates or protein extracts); for a
fluorescent label, by measuring fluorescence in the cells by appropriate
instrumentation.

If a compound tested as a candidate inhibitor of transport function causes the
test cells to have less labeled fatty acid detected in the cells than that detected in the
control cells, then the compound is an inhibitor of the fatty acid transport protein.
Procedures analogous to those above can be devised for identifying enhancers
(agonists of FATPs) of fatty acid transport function wherein if the test cells contain
more labeled fatty acid than that detected in the control cells, or if the fatty acid is
taken up at a higher rate, then the compound being tested can be concluded to be an
enhancer of the fatty acid transport protein.
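
Where the comparison is made on the rate of uptake rather than on a single endpoint, the rate can be estimated from a time course. The following sketch fits a least-squares line to signal versus time; the assumption of an approximately linear initial phase, the function name, and the sample readings are hypothetical and not taken from the Examples.

```python
def uptake_rate(times, signals):
    """Least-squares slope of signal versus time (signal units per time unit)."""
    n = len(times)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired time/signal readings")
    t_mean = sum(times) / n
    s_mean = sum(signals) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (s - s_mean) for t, s in zip(times, signals))
    return sxy / sxx


# Hypothetical time courses (minutes, label signal) for test and control cells.
minutes = [0, 1, 2, 3]
test_rate = uptake_rate(minutes, [0, 30, 60, 90])
control_rate = uptake_rate(minutes, [0, 10, 20, 30])
print(test_rate > control_rate)  # a higher rate in test cells suggests an enhancer
```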

Example 13 describes use of an assay of this type to identify an inhibitor of a
FATP. In Example 13, an antisense oligonucleotide which specifically inhibits
biosynthesis of mmFATP4 was demonstrated to inhibit fatty acid uptake into mouse
enterocytes. Similarly, antisense oligonucleotides directed towards specifically
inhibiting the biosynthesis of FATP6 in heart cells, FATP5 in liver cells, FATP3 in
lung cells, and FATP2 in colon cells, can be demonstrated as examples of "test
agents" that inhibit fatty acid transport.

Another assay to determine whether an agent is an inhibitor (or enhancer) of
fatty acid transport employs animals, one or more of which are administered the
agent, and one or more of which are maintained under similar conditions, but are not
administered the agent. Both groups of animals are given fatty acids (e.g., orally,
intravenously, by tube inserted into stomach or intestine), and the fatty acids taken up
into a bodily fluid (e.g., serum) or into an organ or tissue of interest are measured
from comparable samples taken from each group of animals. The fatty acids may
carry a label (e.g., radioactive) to facilitate detection and quantitation of fatty acids
taken up into the fluid or tissue being sampled. This type of assay can be used alone
or can be used in addition to in vitro assays of a candidate inhibitor or enhancer.

An agent determined to be an inhibitor (or enhancer) of FATP
function, such as fatty acid binding and/or fatty acid uptake, can be administered to
cells in culture, or in vivo, to a mammal (e.g., a human) to inhibit (or enhance) FATP
function. Such an agent may be one that acts directly on the FATP (for example, by
binding) or can act on an intermediate in a biosynthetic pathway to produce FATP,
such as transcription of the FATP gene, processing of the mRNA, or translation of the
mRNA. An example of such an agent is an antisense oligonucleotide.

Antisense methods similar to those illustrated in Example 13 can be used to
determine the target FATP of a compound or agent that has an inhibitory or
enhancing effect on fatty acid uptake. For example, antisense oligonucleotide
directed to the inhibition of FATP4 biosynthesis can be added to lung cells or cell
lines derived from lung cells. In addition, antisense oligonucleotides directed to the
inhibition of other FATPs, except for FATP3, can also be added to the lung cells.
The administration of antisense oligonucleotides in this manner ensures that the
predominant FATP activity remaining in the cells comes from FATP3. After a period
of incubation of the cells with the antisense oligonucleotides sufficient to deplete the
plasma membrane of the FATPs whose biosynthesis has been inhibited, a test agent,
preferably one that has been shown by some preliminary test to have an inhibitory or
enhancing activity on fatty acid transport, can be added to the lung cells. If the test
agent is now demonstrated, after treatment of the cells with antisense
oligonucleotides, to have an inhibitory or enhancing activity on fatty acid transport in
the lung cells, it can be concluded that the target of the test agent is FATP3, or a
molecule involved in the biosynthesis or activity of FATP3.
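
The reasoning in this antisense approach is a process of elimination, which can be sketched as set arithmetic. The function names are hypothetical, and the set of FATPs shown as expressed in lung cells is an illustrative assumption rather than data from the Examples.

```python
def remaining_fatps(expressed, antisense_targets):
    """FATPs still expected at the plasma membrane after antisense depletion."""
    return set(expressed) - set(antisense_targets)


def infer_target(expressed, antisense_targets, agent_still_active):
    """Name the candidate target FATP, or None if the result is inconclusive.

    A conclusion is drawn only when exactly one FATP remains and the test
    agent still alters fatty acid transport in the depleted cells.
    """
    remaining = remaining_fatps(expressed, antisense_targets)
    if agent_still_active and len(remaining) == 1:
        return next(iter(remaining))
    return None


# Hypothetical expression profile for a lung cell line.
expressed_in_lung = {"FATP1", "FATP3", "FATP4"}
print(infer_target(expressed_in_lung, {"FATP1", "FATP4"}, agent_still_active=True))
```

As the text notes, a positive result points to the remaining FATP or to a molecule involved in its biosynthesis or activity, so the inferred name is a candidate rather than a proof.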

In another type of cell-based assay for uptake of fatty acids, a change of
intracellular pH resulting from the uptake of fatty acids can be followed by an
indicator fluorophore. The fluorophore can be taken up by the cells in a
preincubation step. Fatty acids can be added to the cell medium, and after some
period of incubation to allow FATP-mediated uptake of fatty acids, the change in λmax
of fluorescence can be measured, as an indicator of a change in intracellular pH, as
the λmax of fluorescence of the fluorophore changes with the pH of its environment,
thereby indicating uptake of fatty acids. One such fluorophore is BCECF (2',7'-
bis(2-carboxyethyl)-5(6)-carboxyfluorescein; Rink, T.J. et al., J. Cell. Biol. 95:189
(1982)).

In assays similar to those described above, a candidate inhibitor or enhancer
of fatty acid transport function can be added (or mock-added, for control cultures) to
cultures of cells engineered to express a desired FATP to which fatty acid substrate is
also added. Inhibition of fatty acid uptake is indicated by a lack of the drop in pH,
indicating fatty acid uptake, that is seen in control cells. Enhancement of fatty acid
uptake is indicated by a larger decrease in intracellular pH, as compared to control
cells not receiving the candidate enhancer of fatty acid transport function.
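
By way of illustration only, the comparison described above can be sketched as a
small calculation. The function name, the sign convention and the tolerance value are
assumptions made for this sketch and are not taken from the specification; a real
assay would set the threshold from the variability of replicate control cultures.

```python
def classify_agent(control_delta_ph, treated_delta_ph, tolerance=0.05):
    """Classify a test agent from intracellular pH changes.

    Each argument is (pH after fatty acid addition) - (pH before), so
    fatty acid uptake gives a negative number. `tolerance` is a
    hypothetical noise threshold, not a value from the specification.
    """
    difference = treated_delta_ph - control_delta_ph
    if difference > tolerance:
        return "inhibitor"   # smaller pH drop than in control cells
    if difference < -tolerance:
        return "enhancer"    # larger pH drop than in control cells
    return "no effect"


# Hypothetical readings: control cells drop by 0.30 pH units.
print(classify_agent(-0.30, -0.05))  # agent largely abolishes the drop
print(classify_agent(-0.30, -0.50))  # agent deepens the drop
```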

Yeast cells can be used in a similar cell-based assay for the uptake of fatty
acids mediated by a FATP, and such an assay can be adapted to a screening assay for
the identification of agents that inhibit or enhance fatty acid uptake by an FATP.
Yeast cells lacking an endogenous FATP activity (mutated, disrupted or deleted for
FAT1; Faergeman, N.J. et al., J. Biol. Chem. 272(13):8531-8538 (1997); Watkins,
P.A. et al., J. Biol. Chem. 273(29):18210-18219 (1998)) can be engineered to harbor
a related gene of the family of FATP-encoding genes, such as a mammalian FATP
(e.g., human FATP4).

Examples of expression vectors include pEG (Mitchell, D.A., et al., Yeast
9:715-723 (1993)) and pDAD1 and pDAD2, which contain a GAL1 promoter (Davis,
L. I. and Fink, G. R., Cell 61:965-978 (1990)). A variety of promoters are suitable
for expression. Available yeast vectors offer a choice of promoters. In one
embodiment, the inducible GAL1 promoter is used. In another embodiment, the
constitutive ADH1 promoter (alcohol dehydrogenase; Bennetzen, J. L. and Hall, B.
D., J. Biol. Chem. 257:3026-3031 (1982)) can be used to express an inserted gene on
glucose-containing media. An example of a vector suitable for expression of a
heterologous FATP gene in yeast is pQB169.

With the introduced FATP gene providing the only fatty acid transport protein
function for the yeast cells, it is possible to study the effect of the heterologous FATP
on fatty acid transport into the yeast cells in isolation. Assays for the uptake of fatty
acids into the yeast cells can be devised that are similar to those described above
and/or those assays that have been illustrated in the Examples. Tests for candidate
inhibitors or enhancers of the heterologous FATP can be done in cultures of yeast
cells, wherein the yeast cells are incubated with fatty acid substrate and an agent to be
tested as an inhibitor or enhancer of FATP function. Fatty acid uptake after a period of
time can be measured by analyzing the contents of the yeast cells for fatty acid
substrate, as compared with control yeast cells incubated with the fatty acid, but not
with the test agent. Yeast cells have the additional advantage, over mammalian cells
in culture, for example, that yeast cells can be forced to rely upon fatty acids as their
only source of carbon, if the growth medium supplied to the yeast cells is formulated
to contain no other source of carbon. Thus, the effect of the heterologous FATP on
fatty acid uptake and metabolism in the engineered yeast cells can be amplified. An
agent that efficiently blocks transport function of the heterologous FATP could result
in death of the yeast cells. Thus, in this case, inhibition of function of the
heterologous FATP can result in loss of viability. A simple measure of viability is
turbidity of the yeast suspension culture, which can be adapted to a high throughput
screening assay for effects of various agents to be tested, using microtiter plates or
similar devices for small-volume cultures of the engineered yeast cells.
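
As a minimal sketch of how turbidity readings from such a microtiter plate might be
scored, the following assumes that turbidity is recorded as an optical density per
well, that untreated-cell and medium-only (blank) wells are included as controls, and
that a 50% cutoff defines a hit. The well names, readings and cutoff are hypothetical
and serve only to illustrate the calculation.

```python
def percent_growth_inhibition(od_test, od_untreated, od_blank):
    """Growth inhibition (%) of a test well relative to untreated cells."""
    growth_range = od_untreated - od_blank
    if growth_range <= 0:
        raise ValueError("untreated control shows no growth above blank")
    return 100.0 * (1.0 - (od_test - od_blank) / growth_range)


def find_hits(plate, od_untreated, od_blank, cutoff=50.0):
    """Return sorted well names whose inhibition meets the hypothetical cutoff."""
    return sorted(
        well for well, od in plate.items()
        if percent_growth_inhibition(od, od_untreated, od_blank) >= cutoff
    )


# Hypothetical optical density readings, one test agent per well.
plate = {"A1": 0.82, "A2": 0.15, "A3": 0.60, "A4": 0.08}
hits = find_hits(plate, od_untreated=0.85, od_blank=0.05)
print(hits)
```

Wells flagged in this way would still need a counter-screen on a non-fatty-acid
carbon source to separate transport inhibitors from generally toxic agents.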

Cell-free assays can also be used to measure the transport of fatty acids across
a membrane, and therefore also to assess a test treatment or test agent for its effect on
the rate or extent of fatty acid transport. An isolated FATP, for example in the
presence of a detergent that preserves the native 3-dimensional structure of the FATP,
or partially purified FATP, can be used in an artificial membrane system typically
used to preserve the native conformation and activity of membrane proteins. Such
systems include liposomes, artificial bilayers of phospholipids, isolated plasma
membrane such as cell membrane fragments, cell membrane fractions, or cell
membrane vesicles, and other systems in which the FATP can be properly oriented
within the membrane to have transport activity. Assays for transport activity can be
performed using methods analogous to those that can be used in cells engineered to
predominantly express one FATP whose function is to be measured. A labeled (e.g.,
radioactively labeled) fatty acid substrate can be incubated with one side of a bilayer
or in a suspension of liposomes constructed to integrate a properly oriented FATP.
The accumulation of fatty acids with time can be measured, using appropriate means
to detect the label (e.g., scintillation counting of medium on each side of the bilayer,
or of the contents of liposomes isolated from the surrounding medium). Assays such
as these can be adapted to use for the testing of agents which might interact with the
FATP to produce an inhibitory or an enhancing effect on the rate or extent of fatty
acid transport. That is, the above-described assay can be done in the presence or
absence of the agent to be tested, and the results compared.
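
One way to compare the two conditions, offered here only as a sketch, is to fit a
straight line to the early part of each time course and take the ratio of the slopes.
The linear fit, the function names and the sample counts are assumptions for
illustration; the specification does not prescribe a particular method of analysis.

```python
def initial_rate(times, counts):
    """Least-squares slope of accumulated label versus time."""
    n = len(times)
    if n < 2 or n != len(counts):
        raise ValueError("need at least two matched time points")
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    return sxy / sxx


def relative_transport(times, counts_with_agent, counts_without_agent):
    """Rate with agent divided by rate without; below 1 suggests inhibition."""
    return initial_rate(times, counts_with_agent) / initial_rate(
        times, counts_without_agent
    )


# Hypothetical scintillation counts from isolated liposomes (arbitrary units).
times = [0, 1, 2, 3]                  # minutes
without_agent = [0, 100, 200, 300]
with_agent = [0, 25, 50, 75]
ratio = relative_transport(times, with_agent, without_agent)
print(ratio)
```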

For examples of isolation of membrane proteins (ADP/ATP carrier and
uncoupling protein), reconstitution into phospholipid vesicles, and assays of
transport, see Klingenberg, M. et al., Methods Enzymol. 260:369-389 (1995). For an
example of a membrane protein (phosphate carrier of Saccharomyces cerevisiae) that
was purified and solubilized from E. coli inclusion bodies, see Schroer, A. et al., J.
Biol. Chem. 273:14269-14276 (1998). The Glut1 glucose transporter of rat has been
expressed in yeast. A crude membrane fraction of the yeast was prepared and
reconstituted with soybean phospholipids into liposomes. Glucose transport activity
could be measured in the liposomes (Kasahara, T. and Kasahara, M., J. Biol. Chem.
273:29113-29117 (1998)). Similar methods can be applied to the proteins and
polypeptides of the invention.

Another embodiment of the invention is a method for inhibiting fatty acid
uptake in a mammal (e.g., a human), comprising administering to the mammal a
therapeutically effective amount of an inhibitor of the transport function of one or
more of the fatty acid transport proteins, thereby decreasing fatty acid uptake by cells
comprising the fatty acid transport protein(s). Where it is desirable to reduce the
uptake of fatty acids, for example, in the treatment of chronic obesity or as a part of a
program of weight control or hyperlipidemia control in a human, one or more
inhibitors of one or more of the fatty acid transport proteins can be administered in an
effective dose, and by an effective route, for example, orally, or by an indwelling
device that can deliver doses to the small intestine. The inhibitor can be one identified
by methods described herein, or can be one that is, for instance, structurally related to
an inhibitor identified by methods described herein (e.g., having chemical adducts to
better stabilize or solubilize the inhibitor). The invention further relates to compositions
comprising inhibitors of fatty acid uptake in a mammal, which may further comprise
pharmaceutical carriers suitable for administration to a subject mammal, such as
sterile solubilizing or emulsifying agents.

A further embodiment of the present invention is a method of enhancing or
increasing fatty acid uptake, such as enhancing or increasing LCFA uptake in the
small intestine (e.g., to treat or prevent a malabsorption syndrome or other wasting
condition) or in the liver (e.g., by an enhancer of FATP5 transport activity to treat
acute liver failure) or in the kidney (e.g., by an enhancer of FATP2 transport activity
to treat kidney failure). In this embodiment, a therapeutically effective amount of an
enhancer of the transport function of one or more of the fatty acid transport proteins
can be administered to a mammalian subject, with the result that fatty acid uptake in
the small intestine is enhanced. In this embodiment, one or more enhancers of one or
more of the fatty acid transport proteins are administered in an effective dose and by a
route (e.g., orally or by a device, such as an indwelling catheter or other device)
which can deliver doses to the gut. The enhancer of FATP function (e.g., an enhancer
of FATP4 function) can be identified by methods described herein or can be one that
is structurally similar to an enhancer identified by methods described herein.

Aerobic reperfusion of ischemic myocardium is a common clinical event
which can occur during such treatments as cardiac surgery, angioplasty, and
thrombolytic therapy after a myocardial infarction. During reperfusion, a rapid
recovery of myocardial energy production is essential for the complete recovery of
contractile function. Not only the extent of recovery of myocardial energy
metabolism but also the type of energy substrate used by the heart during reperfusion
are important determinants of functional recovery. Circulating fatty acid levels
increase following acute myocardial infarction or during cardiac surgery, such that
during and following ischemia the heart muscle can be exposed to very high
concentrations of fatty acids (Lopaschuk, G.D. and W. C. Stanley, Science and
Medicine (November/December 1997)). High plasma fatty acid concentrations
increase the severity of ischemic damage in a number of experimental models of
cardiac ischemia and have been linked to depression of mechanical function during
aerobic reperfusion of previously ischemic hearts. Further data show that modifying
fatty acid utilization can be beneficial for heart function in ischemia and can be a
useful approach for the treatment of angina. See, e.g., Desideri and Celegon, Am. J.
Cardiol. 82(5A):50K-53K; Lopaschuk, Am. J. Cardiol. 82(5A):14K-17K. Plasma
fatty acid concentrations can be reduced by administering to a human subject or other
mammal an effective amount of an inhibitor of a FATP such as FATP2 or FATP4,
thereby providing a way of reducing fatty acid utilization by the heart.

In a further embodiment of the invention, a therapeutically effective amount
of an inhibitor of hsFATP6 can be administered to a human patient by a suitable
route, to reduce the uptake of fatty acids by cardiac muscle. This treatment is
desirable in patients who are diagnosed as having, or who are at risk of, abnormal
accumulations of fatty acids in the heart or a detrimentally high rate of uptake of fatty
acids into the heart, because of ischemic heart disease, or following ischemia or
trauma to the heart.

The invention further relates to antibodies that bind to an isolated or
recombinant fatty acid transport protein of the FATP family, including portions of
antibodies, which can specifically recognize and bind to one or more FATPs. The
antibodies and portions thereof of the invention include those which bind to one or
more FATPs of mouse or other mammalian species. In a preferred embodiment, the
antibodies specifically bind to a naturally occurring FATP of humans. The antibodies
can be used in methods to detect or to purify a protein of the present invention or a
portion thereof by various methods of immunoaffinity chromatography, to inhibit the
function of a protein in a method of therapy, or to selectively inactivate an active site,
or to study other aspects of the structure of these proteins, for example.

The antibodies of the present invention can be polyclonal or monoclonal. The
term antibody is intended to encompass both polyclonal and monoclonal antibodies.
Antibodies of the present invention can be raised against an appropriate immunogen,
including proteins or polypeptides of the present invention, such as an isolated or
recombinant FATP1, FATP2, FATP3, FATP4, FATP5, FATP6, mtFATP, ceFATPa,
ceFATPb, scFATP or portions thereof, or synthetic molecules, such as synthetic
peptides (e.g., conjugated to a suitable carrier). Preferred embodiments are antibodies
that bind to any of the following: hsFATP1, hsFATP2, hsFATP3, hsFATP4,
hsFATP5 or hsFATP6. The immunogen can be a polypeptide comprising a portion
of a FATP and having at least one function of a fatty acid transport protein, as
described herein.

The term antibody is also intended to encompass single chain antibodies,
chimeric, humanized or primatized (CDR-grafted) antibodies and the like, as well as
chimeric or CDR-grafted single chain antibodies, comprising portions from more
than one species. For example, the chimeric antibodies can comprise portions of
proteins derived from two different species, joined together chemically by
conventional techniques or prepared as a single contiguous protein using genetic
engineering techniques (e.g., DNA encoding the protein portions of the chimeric
antibody can be expressed to produce a contiguous protein chain). See, e.g., Cabilly et
al., U.S. Patent No. 4,816,567; Cabilly et al., European Patent No. 0,125,023 B1;
Boss et al., U.S. Patent No. 4,816,397; Boss et al., European Patent No. 0,120,694
B1; Neuberger, M.S. et al., WO 86/01533; Neuberger, M.S. et al., European Patent
No. 0,194,276 B1; Winter, U.S. Patent No. 5,225,539; Winter, European Patent No.
0,239,400 B1; Queen et al., U.S. Patent No. 5,585,089; and Queen et al., European
Patent No. EP 0 451 216 B1. See also, Newman, R. et al., BioTechnology 10:1455-
1460 (1992), regarding primatized antibody, and Ladner et al., U.S. Patent No.
4,946,778 and Bird, R.E. et al., Science 242:423-426 (1988) regarding single chain
antibodies.

Whole antibodies and biologically functional fragments thereof are also
encompassed by the term antibody. Biologically functional antibody fragments
which can be used include those fragments sufficient for binding of the antibody
fragment to a FATP to occur, such as Fv, Fab, Fab' and F(ab')2 fragments. Such
fragments can be produced by enzymatic cleavage or by recombinant techniques. For
instance, papain or pepsin cleavage can generate Fab or F(ab')2 fragments,
respectively. Antibodies can also be produced in a variety of truncated forms using
antibody genes in which one or more stop codons have been introduced upstream of
the natural stop site. For example, a chimeric gene encoding a F(ab')2 heavy chain
portion can be designed to include DNA sequences encoding the CH1 domain and
hinge region of the heavy chain.

Preparation of immunizing antigen (whole cells comprising FATP on the cell
surface or purified FATP), and polyclonal and monoclonal antibody production can
be performed using any suitable technique. A variety of methods have been
described (see e.g., Kohler et al., Nature 256:495-497 (1975) and Eur. J. Immunol.
6:511-519 (1976); Milstein et al., Nature 266:550-552 (1977); Koprowski et al.,
U.S. Patent No. 4,172,124; Harlow, E. and D. Lane, 1988, Antibodies: A Laboratory
Manual, (Cold Spring Harbor Laboratory: Cold Spring Harbor, NY); Chapter 11 in
Current Protocols In Molecular Biology, Vol. 2 (containing supplements up through
Supplement 42, 1998), Ausubel, F.M. et al., eds., (John Wiley & Sons: New York,
NY)). Generally, a hybridoma can be produced by fusing a suitable immortal cell
line (e.g., a myeloma cell line such as SP2/0) with antibody producing cells. The
antibody producing cells, preferably those obtained from the spleen or lymph nodes,
can be obtained from animals immunized with the antigen of interest. Immunization
of animals can be by introduction of whole cells comprising fatty acid transport
protein on the cell surface. The fused cells (hybridomas) can be isolated using
selective culture conditions, and cloned by limiting dilution. Cells which produce
antibodies with the desired specificity can be selected by a suitable assay (e.g.,
ELISA).

Other suitable methods of producing or isolating antibodies (including human
antibodies) of the requisite specificity can be used, including, for example, methods
which select recombinant antibody from a library (e.g., Hoogenboom et al.,
WO 93/06213; Hoogenboom et al., U.S. Patent No. 5,565,332; WO 94/13804,
published June 23, 1994; and Dower, W.J. et al., U.S. Patent No. 5,427,908), or
which rely upon immunization of transgenic animals (e.g., mice) capable of
producing a full repertoire of human antibodies (see e.g., Jakobovits et al., Proc.
Natl. Acad. Sci. USA 90:2551-2555 (1993); Jakobovits et al., Nature 362:255-258
(1993); Lonberg et al., U.S. Patent No. 5,569,825; Lonberg et al., U.S. Patent No.
5,545,806; Surani et al., U.S. Patent No. 5,545,807; and Kucherlapati, R. et al.,
European Patent No. EP 0 463 151 B1).

Another aspect of the invention is a method for directing an agent to cardiac
muscle. The differential expression of FATP6 in cardiac muscle but not in other
tissue types allows for the specific targeting of drugs, diagnostic agents, tagging
labels, histological stains or other substances to cardiac muscle. A
targeting vehicle can be used for the delivery of such a substance. Targeting vehicles
which bind specifically to FATP6 can be linked to a substance to be delivered to the
cells of cardiac muscle. The linkage can be, for instance, via one or more covalent
bonds, or by high affinity non-covalent bonds. A targeting vehicle can be an
antibody, for instance, or other compound (e.g., a fatty acid or fatty acid analog)
which binds to FATP6 with high specificity.

Targeting vehicles specific to the heart-specific protein FATP6 have in vivo
(e.g., therapeutic and diagnostic) applications. For example, an antibody which
specifically binds to FATP6 can be conjugated to a drug to be targeted to the heart
(e.g., a cardiac glycoside to treat congestive heart failure, or β-adrenergic agents,
sodium channel blockers or calcium channel blockers to treat arrhythmias). A
substance (e.g., a radioactive substance) which can be detected (e.g., a label) in vivo
can also be linked to a targeting vehicle which specifically binds to a heart-specific
protein such as FATP6, and the conjugate can be used as a labeling agent to identify
cardiac muscle cells.

Targeting vehicles specific to FATP6 find further applications in vitro. For
example, an FATP6-specific targeting vehicle, such as an antibody (a polyclonal
preparation or monoclonal) which specifically binds to FATP6, can be linked to a
substance which can be used as a stain for a tissue sample (e.g., horseradish
peroxidase) to provide a method for the identification of cardiac muscle in a sample,
as can be used in embryology studies, for example.

In a similar manner, an agent can be directed to the liver of a mammal, as
FATP5 is expressed in liver but not in other tissue types. A targeting vehicle which
specifically binds to FATP5 can be conjugated to a drug for delivery of the drug to
the liver, such as a drug to treat hepatitis, Wilson's disease, lipid storage diseases and
liver cancer. As with targeting vehicles specific to FATP6, targeting vehicles specific
to FATP5 can be used in studying tissue samples in vitro.

The invention also relates to compositions comprising a modulator of FATP
function. The term "modulate" as used herein refers to the ability of a molecule to
alter the function of another molecule. Thus, modulate could mean, for example,
inhibit, antagonize, agonize, upregulate, downregulate, induce, or suppress. A
modulator has the capability of altering function of its target. Such alteration can be
accomplished at any stage of the transcription, translation, expression or function of
the protein, so that, for example, modulation of a target gene can be accomplished by
modulation of the DNA or RNA encoding the protein, or of the protein itself.

Antagonists or agonists (inhibitors or enhancers) of the FATPs of the
invention, antibodies that bind a FATP, or mimetics of a FATP can be employed in
combination with a non-sterile or sterile carrier or carriers for use with cells, tissues
or organisms, such as a pharmaceutical carrier suitable for administration to a
mammalian subject. Such compositions comprise, for instance, a media additive or a
therapeutically effective amount of an inhibitor or enhancer compound to be
identified by an assay of the invention and a pharmaceutically acceptable carrier or
excipient. Such carriers may include, but are not limited to, saline, buffered saline,
dextrose, water, ethanol, surfactants, such as glycerol, excipients such as lactose and
combinations thereof. The formulation can be chosen by one of ordinary skill in the
art to suit the mode of administration. The chosen route of administration will be
influenced by the predominant tissue or organ location of the FATP whose function is
to be inhibited or enhanced. For example, for affecting the function of FATP4, a
preferred administration can be oral or through a tube inserted into the stomach (e.g.,
direct stomach tube or nasopharyngeal tube), or through other means to accomplish
delivery to the small intestine. The invention further relates to diagnostic and
pharmaceutical packs and kits comprising one or more containers filled with one or
more of the ingredients of the aforementioned compositions of the invention.

Compounds of the invention which are FATPs, FATP fusion proteins, FATP
mimetics, FATP gene-specific antisense poly- or oligonucleotides, inhibitors or
enhancers of a FATP may be employed alone or in conjunction with other
compounds, such as therapeutic compounds. The pharmaceutical compositions may
be administered in any effective, convenient manner, including administration by
topical, oral, anal, vaginal, intravenous, intraperitoneal, intramuscular, subcutaneous,
intranasal, transdermal or intradermal routes, among others. In therapy or as a
prophylactic, the active agent may be administered to an individual as an injectable
composition, for example as a sterile aqueous dispersion, preferably isotonic.

Alternatively, the composition may be formulated for topical application, for
example, in the form of ointments, creams, lotions, eye ointments, eye drops, ear
drops, mouthwash, impregnated dressings and sutures and aerosols, and may contain
appropriate conventional additives, including, for example, preservatives, solvents to
assist drug penetration, and emollients in ointments and creams. Such topical
formulations may also contain compatible conventional carriers, for example cream
or ointment bases, and ethanol or oleyl alcohol for lotions.

In addition, the amount of the compound will vary depending on the size, age,
body weight, general health, sex, and diet of the host, and the time of administration,
the biological half-life of the compound, and the particular characteristics and
symptoms of the disorder to be treated. Adjustment and manipulation of established
dose ranges are well within the ability of those of skill in the art.

A further aspect of the invention is a method to identify a polymorphism, or
the presence of an alternative or variant allele of a gene in the genome of an organism
(of interest here, genes encoding FATPs). As used herein, polymorphism refers to
the occurrence of two or more genetically determined alternative sequences or alleles
in a population. A polymorphic locus may be as small as a base pair. Polymorphic
markers include restriction fragment length polymorphisms, variable number of
tandem repeats (VNTRs), hypervariable regions, minisatellites, dinucleotide repeats,
trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion
elements such as Alu. The first identified allelic form, or the most frequently
occurring form can be arbitrarily designated as the reference (usually, "wildtype")
form, and other allelic forms are designated as alternative (sometimes, "mutant" or
"variant"). Diploid organisms may be homozygous or heterozygous for allelic forms.

An "allele" or "allelic sequence" is an alternative form of a gene which may 
result from at least one mutation in the nucleotide sequence. Alleles may result in 
altered mRNAs or polypeptides whose structure or function may or may not be 
altered. Any given gene may have none, one, or many allelic forms (polymorphism).
Common mutational changes which give rise to alleles are generally ascribed to 
natural deletions, additions, or substitutions of nucleotides. Each of these types of 
changes may occur alone, or in combination with the others, one or more times in a 
given sequence. 

Several different types of polymorphisms have been reported. A restriction
fragment length polymorphism (RFLP) is a variation in DNA sequence that alters the
length of a restriction fragment (Botstein et al., Am. J. Hum. Genet. 32:314-331
(1980)). The restriction fragment length polymorphism may create or delete a
restriction site, thus changing the length of the restriction fragment. RFLPs have
been widely used in human and animal genetic analyses (see WO 90/13668; WO
90/11369; Donis-Keller, Cell 51:319-337 (1987); Lander et al., Genetics 121:85-99
(1989)). When a heritable trait can be linked to a particular RFLP, the presence of
the RFLP in an individual can be used to predict the likelihood that the individual
will also exhibit the trait.
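
The effect of a sequence change on restriction fragment length can be illustrated
with a short sketch. The sequences below are invented for the example and are not
FATP sequences; the enzyme shown is EcoRI, which recognizes GAATTC and cuts
after the first G. The function name and its parameters are likewise hypothetical.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Lengths of fragments after cutting a linear sequence at every site.

    cut_offset is the number of bases of the recognition site that stay
    on the upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    edges = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(edges, edges[1:])]


# Hypothetical 38-base reference allele carrying a single EcoRI site.
reference = "ATGC" * 3 + "GAATTC" + "ATGC" * 5
# A single-base change within the site abolishes cutting in the variant.
variant = reference.replace("GAATTC", "GAGTTC")

print(fragment_lengths(reference, "GAATTC", 1))  # two fragments
print(fragment_lengths(variant, "GAATTC", 1))    # one uncut fragment
```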

Other polymorphisms take the form of short tandem repeats (STRs) that
include tandem di-, tri- and tetra-nucleotide repeated motifs. These tandem repeats
are also referred to as variable number tandem repeat (VNTR) polymorphisms.
VNTRs have been used in identity and paternity analysis (US 5,075,217; Armour et
al., FEBS Lett. 307:113-115 (1992); Horn et al., WO 91/14003; Jeffreys, EP
370,719), and in a large number of genetic mapping studies.

Other polymorphisms take the form of single nucleotide variations between
individuals of the same species. Such polymorphisms are far more frequent than
RFLPs, STRs (short tandem repeats) and VNTRs (variable number tandem repeats).
Some single nucleotide polymorphisms occur in protein-coding sequences, in which
case, one of the polymorphic forms may give rise to the expression of a defective or
other variant protein and, potentially, a genetic disease. Other single nucleotide
polymorphisms occur in noncoding regions. Some of these polymorphisms may also
result in defective protein expression (e.g., as a result of defective splicing). Other
single nucleotide polymorphisms have no phenotypic effects.

Many of the methods described below require amplification of DNA from
target samples and purification of the amplified products. This can be accomplished
by PCR, for instance. See generally, PCR Technology, Principles and Applications
for DNA Amplification (ed. H.A. Erlich), Freeman Press, New York, NY, 1992; PCR
Protocols: A Guide to Methods and Applications (eds. Innis, et al.), Academic Press,
San Diego, CA, 1990; Mattila et al., Nucleic Acids Res. 19:4967 (1991); Eckert et al.,
PCR Methods and Applications 1:17 (1991); PCR (eds. McPherson et al., IRL Press,
Oxford); and US 4,683,202.

Other suitable amplification methods include the ligase chain reaction (LCR)
(see Wu and Wallace, Genomics 4:560 (1989); Landegren et al., Science 241:1017
(1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA
86:1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad.
Sci. USA 87:1874 (1990)), and nucleic acid based sequence amplification (NASBA).
The latter two amplification methods involve isothermal reactions based on
isothermal transcription, which produce both single stranded RNA (ssRNA) and
double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or
100 to 1, respectively.

Another aspect of the invention is a method for detecting a variant allele of a
human FATP gene, comprising preparing amplified, purified FATP DNA from a
reference human and amplified, purified FATP DNA from a "test" human to be
compared to the reference as having a variant allele, using the same or comparable
amplification procedures, and determining whether the reference DNA and test DNA
differ in DNA sequence in the FATP gene, whether in a coding or a noncoding
region, wherein, if the test DNA differs in sequence from the reference DNA, the test
DNA comprises a variant allele of a human FATP gene. The following is a
discussion of some of the methods by which it can be determined whether the
reference FATP DNA and test FATP DNA differ in sequence.
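
Once sequence has been obtained for both samples, the final determination reduces
to a position-by-position comparison, sketched below under simplifying assumptions:
the two sequences are taken to be already aligned and of equal length, and the short
sequences shown are invented for illustration. The function names are hypothetical.

```python
def sequence_differences(reference, test):
    """List (1-based position, reference base, test base) for each mismatch.

    Assumes the two sequences are already aligned and of equal length;
    insertions or deletions would require an alignment step first.
    """
    if len(reference) != len(test):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i, r, t)
        for i, (r, t) in enumerate(zip(reference.upper(), test.upper()), start=1)
        if r != t
    ]


def has_variant_allele(reference, test):
    """True if the test sequence differs from the reference at any position."""
    return bool(sequence_differences(reference, test))


# Hypothetical aligned fragments differing at a single base.
print(sequence_differences("ATGGCCTTA", "ATGGTCTTA"))
```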

Direct Sequencing. The direct analysis of the sequence of variant alleles of
the present invention can be accomplished using either the dideoxy chain termination
method or the Maxam and Gilbert method (see Sambrook et al., Molecular Cloning:
A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, New York, 1989; Zyskind
et al., Recombinant DNA Laboratory Manual, Acad. Press, 1988).

Denaturing Gradient Gel Electrophoresis. Amplification products generated
using the polymerase chain reaction can be analyzed by the use of denaturing gradient
gel electrophoresis. Different alleles can be identified based on the different sequence-
dependent strand dissociation properties and electrophoretic migration of DNA in
solution (chapter 7 in Erlich, ed. PCR Technology, Principles and Applications for
DNA Amplification, W.H. Freeman and Co., New York, 1992).

Single-strand Conformation Polymorphism Analysis. Alleles of target
sequences can be differentiated using single-strand conformation polymorphism
analysis, which identifies base differences by alteration in electrophoretic migration
of single stranded PCR products, as described in Orita et al., Proc. Natl. Acad. Sci.
USA 86:2766-2770 (1989). Amplified PCR products can be generated as described
above, and heated or otherwise denatured, to form single-stranded amplification
products. Single-stranded nucleic acids may refold or form secondary structures
which are partially dependent on the base sequence. The different electrophoretic
mobilities of single-stranded amplification products can be related to base-sequence
differences between alleles of target sequences.

Detection of Binding by Protein That Binds to Mismatches. Amplified DNA
comprising the FATP gene or portion of the gene of interest from genomic DNA, for
example, of a normal individual is prepared, using primers designed on the basis of
the DNA sequences provided herein. Amplified DNA is also prepared, in a similar
manner, from genomic DNA of an individual to be tested for bearing a
distinguishable allele. The primers used in PCR carry different labels, for example,
primer 1 with biotin, and primer 2 with 32P. Unused primers are separated from the
PCR products, and the products are quantitated. Heteroduplexes, formed by
denaturing and reannealing the products from the two individuals, are used in a
mismatch detection assay using immobilized mismatch binding protein (MutS) bound
to nitrocellulose. The presence of biotin-labeled DNA, wherein mismatched regions
are bound to the nitrocellulose via MutS protein, is detected by visualizing the
binding of streptavidin to biotin. See WO 95/12689. MutS protein has also been
used in the detection of point mutations in a gel-mobility-shift assay (Lishanski, A. et
al., Proc. Natl. Acad. Sci. USA 91:2674-2678 (1994)).

Other methods, such as those described below, can be used to distinguish a 
FATP allele from a reference allele, once a particular allele has been characterized as 
to DNA sequence. 

Allele-specific probes. The design and use of allele-specific probes for
analyzing polymorphisms is described by, e.g., Saiki et al., Nature 324:163-166
(1986); Dattagupta, EP 235,726; Saiki, WO 89/11548. Allele-specific probes can be
designed so that they hybridize to a segment of a target DNA from one individual but
do not hybridize to the corresponding segment from another individual due to the
presence of different polymorphic forms in the respective segments from the two
individuals. Hybridization conditions should be sufficiently stringent that there is a
significant difference in hybridization intensity between alleles, and preferably an
essentially binary response, whereby a probe hybridizes to only one of the alleles.
Some probes are designed to hybridize to a segment of target DNA such that the
polymorphic site aligns with a central position (e.g., in a 15-mer at the 7 position; in a
16-mer, at either the 8 or 9 position) of the probe. This design of probe achieves
good discrimination in hybridization between different allelic forms.
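
As an illustration of the central-position design, the following minimal Python sketch extracts a reference/variant probe pair in which the polymorphic site sits at the centre of the probe. The function names, the 31-base target and the A/G polymorphism are hypothetical and are not taken from the FATP sequences; the probes are written in the sense of the target strand, whereas a working probe would be chosen to be complementary to the strand actually interrogated.

```python
def centered_probe(sequence: str, site: int, length: int = 15) -> str:
    """Return a probe of `length` bases with `site` (0-based) at its centre."""
    left = (length - 1) // 2          # bases placed to the left of the site
    start = site - left
    end = start + length
    if start < 0 or end > len(sequence):
        raise ValueError("polymorphic site too close to the end of the sequence")
    return sequence[start:end]


def probe_pair(reference: str, site: int, variant_base: str, length: int = 15):
    """Build (reference_probe, variant_probe) differing only at the central base."""
    variant = reference[:site] + variant_base + reference[site + 1:]
    return (centered_probe(reference, site, length),
            centered_probe(variant, site, length))


# Hypothetical target with an A/G polymorphism at 0-based index 15.
target = "ACGTTGCAAGCTTGGACCTAGGCATTCGATC"
ref_probe, var_probe = probe_pair(target, 15, "G")
print(ref_probe)
print(var_probe)
```

For a 15-mer the polymorphic base lands at 0-based index 7 of the probe; for a 16-mer the same function places it at the eighth of sixteen positions.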

Allele-specific probes are often used in pairs, one member of a pair showing a
perfect match to a reference form of a target sequence and the other member showing
a perfect match to a variant form. Several pairs of probes can then be immobilized on
the same support for simultaneous analysis of multiple polymorphisms within the
same target sequence.

Allele-specific Primers. An allele-specific primer hybridizes to a site on
target DNA overlapping a polymorphism, and only primes amplification of an allelic
form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acids
Res. 17:2427-2448 (1989). This primer is used in conjunction with a second primer
which hybridizes at a distal site. Amplification proceeds from the two primers,
resulting in a detectable product which indicates the particular allelic form is present.
A control is usually performed with a second pair of primers, one of which shows a
single base mismatch at the polymorphic site and the other of which exhibits perfect
complementarity to a distal site. The single-base mismatch prevents amplification
and no detectable product is formed. The method works best when the mismatch is
included in the 3'-most position of the oligonucleotide aligned with the polymorphism
because this position is most destabilizing to elongation from the primer (see, e.g.,
WO 93/22456).
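
The placement of the polymorphic base at the 3'-most position of the primer can be sketched as follows. This is a deliberately crude model in which extension is assumed to occur only on a perfect match; the function names and the example sequence are hypothetical, and `strand` denotes the strand whose sequence the primer copies (the primer anneals to its complement).

```python
def allele_specific_primer(strand: str, site: int, allele: str, length: int = 18) -> str:
    """Primer of `length` bases whose 3'-most base is `allele` at 0-based `site`."""
    start = site - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for a primer of this length")
    return strand[start:site] + allele


def would_prime(primer: str, strand: str, site: int) -> bool:
    """Crude model: extension only if the primer matches perfectly, 3' base included."""
    start = site - len(primer) + 1
    return strand[start:site + 1] == primer


# Hypothetical sequence; the individual tested carries G at 0-based index 20.
strand = "ACGTTGCAAGCTTGGACCTAGGCATTCGATC"
matched = allele_specific_primer(strand, 20, "G")
mismatched = allele_specific_primer(strand, 20, "T")   # control primer
print(would_prime(matched, strand, 20), would_prime(mismatched, strand, 20))
```

Real primer behaviour depends on annealing temperature and polymerase, so a single 3' mismatch reduces rather than strictly abolishes extension; the boolean model above only captures the design intent.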

Gene Chips. Allelic variants can also be identified by hybridization to nucleic
acids immobilized on solid supports (gene chips), as described, for example, in WO
95/11995 and U.S. Patent No. 5,143,854, both of which are incorporated herein by
reference. WO 95/11995 describes subarrays that are optimized for detection of a
characterized variant allele. Such a subarray contains probes designed to be
complementary to a second reference sequence, which is an allelic variant of the first
reference sequence.

The present method is illustrated by the following examples, which are not 
intended to be limiting in any way. 

EXAMPLES 
Materials and Methods 
The following Materials and Methods were used in the work described in
Examples 1-5.

Sequence Alignment of FATP Clones. The DNA sequence for mouse FATP1
was obtained from the National Center for Biotechnology Information nonredundant
database. cDNAs for mmFATP2, 3, 4, and 5 were obtained by screening mouse
expression libraries (purchased from GIBCO/BRL, Rockville, MD) with probes
derived from the cloned expressed sequence tags (ESTs) (Research Genetics,
Huntsville, AL). Full-length clones were obtained for mmFATP2 and 5 and partial
sequences for mmFATP3 and 4. The sequences described herein have been
deposited in the GenBank database (Accession Nos. FATP2, AF072760; FATP3,
AF072759; FATP4, AF072758; FATP5, AF072757).

Neither FATP2 nor FATP5 contains an in-frame stop codon upstream of the
putative initiator methionine; initiator methionines were assigned by homology with
that in mmFATP1 and by the presence of a signal sequence immediately after it. The
Mycobacterium tuberculosis, Caenorhabditis elegans, and Saccharomyces cerevisiae
sequences were present in the dbEST database as part of the sequencing projects for
these organisms. Sequences were aligned utilizing a ClustalX algorithm and the
resulting alignment exported to SeqVu. Homologous amino acid substitutions are
boxed in Figure 1 and were determined using the Dayhoff 250 method with a 50%
homology cutoff.

Cell Transfection and LCFA Uptake. COS cells were cotransfected using the
DEAE-dextran method with the mammalian expression vector pCDNA 3.1
(Invitrogen, Carlsbad, CA) expressing the gene for CD2 (pCDNA-CD2) in
combination with either a pCDNA 3.1 or pCMVSPORT2 (GIBCO/BRL, Rockville,
MD) expression vector containing one of the murine or nematode FATP genes
(pCDNA-mmFATP1, pCDNA-FATP2, pCMVSPORT-FATP5, pCDNA-ceFATPb).
Two days after transfection, cells were assayed for CD2 expression with a
phycoerythrin-coupled anti-CD2 (PE-CD2) monoclonal antibody (PharMingen,
Franklin Lakes, NJ), and fatty acid uptake was assayed with a BODIPY-labeled fatty
acid analogue (Molecular Probes). Briefly, cells were washed twice with PBS
(phosphate buffered saline) and stained with PE-CD2 at 4°C for 30 min in PBS
containing 10% fetal calf serum. They were then washed three times with PBS/fetal
calf serum for 5 min followed by an incubation for 2 min at 37°C in fatty acid uptake
solution, which contained 0.1 µM BODIPY-FA and 0.1% fatty acid-free BSA
(bovine serum albumin) in PBS (Schaffer, J.E. & Lodish, H.F. (1994) Cell 79:427-
436). After 2 min, the cells were washed four times with ice-cold PBS/0.1% BSA.
The cells were then removed from the plates with PBS containing 5 mM EDTA and
resuspended in PBS containing 10% fetal calf serum and 10 mM EDTA. PE-CD2
and BODIPY-FA fluorescence were measured using a FACScan (Becton Dickinson,
Franklin Lakes, NJ). COS cells were gated on forward scatter (FSC) and side scatter
(SS). Cells exhibiting more than 300 CD2 fluorescence units (dsim), representing
15% of all cells, were deemed CD2 positive and their BODIPY-FA fluorescence was
quantitated.
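
The CD2 gating and quantitation step can be sketched in Python as follows. This is a minimal illustration that assumes per-cell fluorescence values have already been exported as (CD2, BODIPY-FA) pairs; the function names and the toy numbers are hypothetical and are not data from these experiments.

```python
from statistics import mean


def uptake_of_cd2_positive(events, cd2_threshold=300.0):
    """Return (mean BODIPY-FA signal of CD2-positive cells, fraction CD2-positive).

    `events` is a list of (cd2_fluorescence, bodipy_fluorescence) pairs that
    have already passed the forward/side scatter gate.
    """
    positive = [bodipy for cd2, bodipy in events if cd2 > cd2_threshold]
    if not positive:
        raise ValueError("no CD2-positive cells above the threshold")
    return mean(positive), len(positive) / len(events)


def fold_increase(fatp_events, control_events, cd2_threshold=300.0):
    """Ratio of mean uptake in FATP-cotransfected versus CD2-only cells."""
    fatp_mean, _ = uptake_of_cd2_positive(fatp_events, cd2_threshold)
    control_mean, _ = uptake_of_cd2_positive(control_events, cd2_threshold)
    return fatp_mean / control_mean


# Toy, made-up events: (CD2 units, BODIPY-FA units)
control = [(50, 10), (350, 12), (500, 8), (100, 9)]
with_fatp = [(400, 300), (600, 500), (20, 10), (10, 11)]
print(fold_increase(with_fatp, control))
```

Restricting the comparison to CD2-positive cells is what lets the untransfected cells in the same dish serve as an internal reference.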

E. coli-Based LCFA Uptake Assay. The full-length coding region of mtFATP
and a control protein, the mammalian transcription factor TFE3, were subcloned into
the inducible, prokaryotic expression vector pET (Novagen, Madison, WI).
Expression was induced with 1 mM isopropyl β-D-thiogalactoside (IPTG) for 1 hour,
or cells were left uninduced. Cells were washed in PBS/0.1% BSA and resuspended
in 1 ml PBS/0.1% BSA containing 0.1 µM [3H]palmitate (NEN) at 37°C. Uptake
was stopped after the indicated incubation time by transferring the cells onto filter
paper using a cell harvester (Brandel, Bethesda, MD). Filters were washed
extensively with ice-cold PBS/0.1% BSA, and [3H]palmitate was quantitated by
scintillation counting.

Northern Blots. Northern blot analysis of murine FATP expression was
done using poly(A) mRNA blots (Clontech, Palo Alto, CA). Probes of each of the
FATPs were derived from the 3' untranslated regions of each gene and were <60%
identical in sequence. Probes were labeled by random priming (Boehringer
Mannheim, Indianapolis, IN) and hybridized at 65°C. Blots were extensively washed
in 0.2% SSC/0.1% SDS at 65°C.

Generation of Phylogenetic Trees. Complete and partial sequences for FATP
genes from human, rat, mouse, puffer fish, Drosophila melanogaster, C. elegans, S.
cerevisiae, and M. tuberculosis were aligned using ClustalX. A homologous region
of 48 amino acids (residues 472-519 in mmFATP1) from all of the genes was used to
determine phylogenetic relationship within ClustalX. Based on these data a
phylogenetic tree was generated using TreeView PPC (Figure 5).
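
To make the idea of deriving relationships from an aligned segment concrete, the following Python sketch computes pairwise distances (fraction of differing residues) between aligned segments and reports the nearest neighbour of one sequence. It is only an illustration of a distance calculation, not the tree-building procedure used by ClustalX; the three 12-residue segments and their labels are invented and are not FATP sequences.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned, non-gap positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(aligned: dict) -> dict:
    """Symmetric dictionary of pairwise distances keyed by (name1, name2)."""
    matrix = {(n, n): 0.0 for n in aligned}
    for n1, n2 in combinations(aligned, 2):
        d = p_distance(aligned[n1], aligned[n2])
        matrix[(n1, n2)] = matrix[(n2, n1)] = d
    return matrix


def nearest(name: str, matrix: dict, names) -> str:
    """Name of the sequence with the smallest distance to `name`."""
    return min((n for n in names if n != name), key=lambda n: matrix[(name, n)])


# Invented aligned segments (hypothetical labels, not real FATP residues).
segments = {
    "seqA": "MKLVAGTTGDPK",
    "seqB": "MKLVAGTTGEPK",
    "seqC": "MRLIAGSTGDAK",
}
dm = distance_matrix(segments)
print(nearest("seqA", dm, segments))
```

A distance matrix of this kind is the usual input to distance-based tree methods such as neighbor joining.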

Nomenclature. It is proposed that the FATP genes be given a species specific
prefix (mm, Mus musculus; hs, Homo sapiens; mt, M. tuberculosis; dm, D.
melanogaster; ce, C. elegans; sc, S. cerevisiae) and numbered such that mammalian
homologues in different species share the same number but differ in their prefix.
Since the two C. elegans genes cannot be paired with a specific human or mouse
FATP, they have been designated ceFATPa and ceFATPb.

Example 1: Identification of Novel Mammalian FATPs

The National Center for Biotechnology Information EST database was
screened, using the mouse FATP protein sequence (mmFATP1), to identify novel
FATPs. This strategy led to the identification of more than 50 murine EST sequences
which could be assembled into five distinct contiguous DNA sequences (contigs).
One contig was identical to the previously cloned FATP, which has been renamed
FATP1. Another, which has been renamed FATP2, is the murine homologue of a rat
gene previously identified by others as a very long chain acyl-CoA synthase
(Uchiyama, A., Aoyama, T., Kamijo, K., Uchida, Y., Kondo, N., Orii, T. &
Hashimoto, T. (1996) J. Biol. Chem. 271:30360-30365). The other three contigs
represented novel genes (FATP3, 4, and 5). Full-length clones for FATP2 and
FATP5 and nearly complete sequences for FATP3 and 4 (Figure 1) were obtained by
screening cDNA libraries made from mouse day 10.5 embryos and adult liver. Also
identified were human homologues for each of the murine genes in the EST database.
A sixth human gene was also identified; whether this gene is also present in the
mouse will require additional studies. Map positions are given in Tables 2 and 3.

The genetic loci for all of the human genes, with the exception of FATP5,
which was already mapped as an unknown EST, were determined using radiation
hybrid panels. The map positions given below show the distance (in centiRays) from
the closest framework marker. As a guideline, there are approximately 300 kb/cR.

Table 2. Mapping Data for Human Genes

Gene      Chromosome   Placement
hsFATP1   Chr19        places 13.35 cR from WI-6344 (lod>3.0)
hsFATP2   Chr15        places 4.92 cR from D15S126 (lod>3.0)
hsFATP3   Chr1         places 13.24 cR from WI-2862 (lod>3.0)
hsFATP4   Chr9         places 7.80 cR from WI-9685 (lod>3.0)
hsFATP5   unknown EST previously mapped to near D19S418
hsFATP6   Chr5         places 1.41 cR from WI-4907 (lod>3.0)
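
As a worked example of the 300 kb/cR guideline, the short Python sketch below converts the centiRay distances of Table 2 into approximate physical distances. The conversion factor is only the rough guideline quoted in the text and varies between panels and chromosomal regions; the function name is illustrative.

```python
KB_PER_CR = 300.0  # approximate guideline from the text; panel-dependent


def cr_to_kb(distance_cr: float, kb_per_cr: float = KB_PER_CR) -> float:
    """Convert a radiation hybrid distance in centiRays to approximate kb."""
    if distance_cr < 0:
        raise ValueError("distance must be non-negative")
    return distance_cr * kb_per_cr


# Distances to the closest framework marker, as listed in Table 2.
placements = {
    "hsFATP1": 13.35,
    "hsFATP2": 4.92,
    "hsFATP3": 13.24,
    "hsFATP4": 7.80,
    "hsFATP6": 1.41,
}
for gene, cr in placements.items():
    print(f"{gene}: {cr:.2f} cR is roughly {cr_to_kb(cr) / 1000:.1f} Mb from its marker")
```

On this guideline, hsFATP6 lies within about 0.4 Mb of WI-4907, while hsFATP1 is roughly 4 Mb from WI-6344.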



The mouse map is an internal backcross panel consisting of 188 mouse
backcross DNAs plus 4 controls (B6, Spretus, F1, Water). The backcross was
constructed by crossing B6 by Spretus animals and then crossing those F1s back to
B6. Mapping is accomplished by taking advantage of recombinational events during
meiosis, and the use of PCR primers to detect the differences (by size or re-annealing
events) at any given locus between the B6 and Spretus allele.

For the purposes of mapping, a novel set of primers (gene of interest)
is used to amplify from all 188 DNAs, and each product is then typed as being a B6
("B") or a Spretus ("S"). This string of B's and S's is entered into the Map Manager
program, which does a best fit calculation by comparing the string of 188 typings
from the gene of interest to all loci already extant in the panel, for all 20
chromosomes. The gene of interest is then assigned to a particular area on a
particular chromosome according to a number of parameters, including the
minimization of double cross-overs, and the highest LOD scores. Indicated in Table 3
are distances to the closest markers on either side of the FATP locus.
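
The comparison of typing strings can be illustrated with a standard two-point calculation for a backcross: count the animals in which the two loci differ, estimate the recombination fraction, and compute a LOD score against the hypothesis of no linkage. This Python sketch is an assumption-laden illustration, not the algorithm implemented in Map Manager; the ten-animal patterns are invented.

```python
import math


def count_recombinants(pattern_a: str, pattern_b: str):
    """Return (recombinants, informative animals) for two B/S typing strings.

    Positions where either locus is not typed as 'B' or 'S' are skipped.
    """
    typed = [(a, b) for a, b in zip(pattern_a, pattern_b)
             if a in "BS" and b in "BS"]
    recombinants = sum(a != b for a, b in typed)
    return recombinants, len(typed)


def lod_score(recombinants: int, n: int) -> float:
    """Two-point LOD at the maximum-likelihood recombination fraction r/n."""
    if n == 0:
        raise ValueError("no informative animals")
    theta = recombinants / n
    if theta >= 0.5:
        return 0.0                      # no evidence for linkage
    lod = n * math.log10(2)             # likelihood under theta = 0.5
    if recombinants:
        lod += recombinants * math.log10(theta)
    lod += (n - recombinants) * math.log10(1 - theta)
    return lod


# Invented typings for a gene of interest and one panel marker.
gene = "BBSSBSBBSS"
marker = "BBSSBSBBSB"
r, n = count_recombinants(gene, marker)
print(f"{r}/{n} recombinants, ~{100 * r / n:.1f} cM, LOD {lod_score(r, n):.2f}")
```

With small recombination fractions the percentage of recombinant animals approximates the distance in cM; mapping functions that correct for double cross-overs are omitted here.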
Table 3. Mapping Data for Mouse Genes

Gene      Chromosome   Placement
mmFATP1   8            places 2.82 cM from D8Mit132 (lod 43.4) and 1.81 cM from
                       D8Mit74 (lod 43.5)
mmFATP2   2            places 1.29 cM from D2Mit258 (lod 47.9) and 1.75 cM from
                       D2NDS3 (lod 44.9)
mmFATP3   3            places 2.54 cM from D3Mit22 (lod 29.5) and 19.62 cM from
                       D3Mit42 (lod 13.6)
mmFATP4   2            places 13.78 cM from D2Mit1 (lod 22.9) and 3.85 cM from
                       D2Mit65 (lod 41.9)
mmFATP5   7            places 7.28 cM proximal of D7Mit21 (lod 28.3)

Example 2: Assessment of Function

The ability of the newly identified mouse genes to function as fatty acid
transporters was assessed using a fluorescence-activated cell sorting-based assay.
COS cells were transiently cotransfected with expression vectors encoding the cell
surface protein CD2 and either mmFATP1, mmFATP2, or mmFATP5.
Two days after transfection, COS cells were stained with an antibody to CD2 and
then incubated with a BODIPY-labeled fatty acid [BODIPY-FA (Schaffer, J.E. &
Lodish, H.F. (1994) Cell 79:427-436)]. The cells were then washed extensively,
lifted off the dish, and analyzed by fluorescence-activated cell sorting. As judged by
the number of CD2-positive cells, the transfection efficiency was approximately 20-
30%. Fatty acid uptake was quantitated in the transiently transfected COS cells by
measuring the BODIPY-FA fluorescence of the CD2-positive cells. Expression of
CD2 had no effect on fatty acid uptake, as shown by the finding that COS cells
expressing only the transfected CD2 cDNA (CD2-positive) had the same low level of
BODIPY-FA uptake as did untransfected (CD2-negative) control cells (Figure 2A,
control). In COS cells cotransfected with CD2 and mmFATP1, mmFATP2, or
mmFATP5, uptake of BODIPY-FA by the transfected (CD2-positive) cells was
increased 15- to 90-fold over control (CD2 cDNA only) cells (Figures 2A-2D).

Example 3: Expression Patterns of Murine FATPs

Expression patterns of members of the murine FATP gene family were
characterized by Northern blot analysis; to avoid cross-hybridization, the probes used
were from the 3' untranslated regions of these genes, which are less than 60% identical
in sequence. The expression pattern of FATP1 agrees with that previously found
(Schaffer, J.E. & Lodish, H.F. (1994) Cell 79:427-436). Here, expression was seen
primarily in heart and kidney. FATP2 is expressed almost exclusively in liver and
kidney, which corresponds to the reported tissue distribution of the rat homologue
[very long chain acyl-CoA synthase (VLACS)] as assessed by Western blotting
(Uchiyama, A., Aoyama, T., Kamijo, K., Uchida, Y., Kondo, N., Orii, T. &
Hashimoto, T. (1996) J. Biol. Chem. 271:30360-30365). FATP3 is present in lung,
liver, and testis. FATP5 is expressed only in liver and cannot be detected in other
tissues even when the blot is overexposed. The human homologue of FATP5 is also
liver specific and is not expressed in a wide array of other tissues tested, including
fetal liver.

Example 4: FATPs Are Evolutionarily Conserved

The EST database was searched, using sequences conserved among the five
murine FATP genes, for FATP genes in other organisms. Two homologues were
found in C. elegans and one in M. tuberculosis. One of the C. elegans genes was
cloned from a cDNA library and expressed in COS cells, as described for the murine
FATPs. Overexpression of the nematode FATP resulted in a 15-fold increase of
BODIPY-FA uptake compared with control cells (Figure 3). The mycobacterial
FATP gene was isolated from a phage library and assessed for its ability to facilitate
fatty acid uptake. E. coli transformed with a prokaryotic, isopropyl β-D-
thiogalactoside-inducible expression vector containing the mycobacterial FATP gene
demonstrated a significant increase in the rate of [3H]palmitate uptake after induction,
compared with uninduced bacteria or E. coli transformed with a control protein
(Figure 4). Novel FATP genes were also identified in F. rubripes (puffer fish) and D.
melanogaster.

Example 5: Phylogenetic Tree of FATPs

Faergeman et al. (Faergeman, N.J., DiRusso, C.C., Elberger, A., Knudsen, J.
& Black, P.N. (1997) J. Biol. Chem. 272:8531-8538) identified three regions of very
strong conservation between the scFATP and mmFATP1 genes. The sequences of the
FATPs were compared over a 311-amino acid FATP "signature sequence" which
includes these conserved regions, corresponding to amino acids 246-557 in
mmFATP1 (underlined in Figure 1). When compared with the National Center for
Biotechnology Information nonredundant database, only one region of the "FATP
signature sequence" shows significant homology to other proteins. This small stretch
of amino acids (underlined in Fig. 1) is an AMP-binding motif found in a multitude
of other proteins, such as acyl-CoA synthase, several CoA ligases, and gramicidin S
synthetase component II (Schaffer, J.E. & Lodish, H.F. (1994) Cell 79:427-436). The
relevance of this motif to fatty acid transport is unclear. Other highly conserved
regions among the FATPs, including long stretches of amino acids >90% identical
from mycobacteria to humans, are not found in any other class of proteins. A 48-
amino acid segment of the FATP signature sequence was used to construct a
phylogenetic tree (Figure 5). Each of the human and mouse genes forms its own
branch; hsFATP6, which as yet has no murine homologue, is most closely related to
hsFATP3 and mmFATP3. As expected, rnVLACS is closer in sequence to
mmFATP2 than to hsFATP2. The FATP genes of invertebrates, i.e., C. elegans and
D. melanogaster, are most closely related to each other. Surprisingly, the
mycobacterial gene is more closely related to the human and mouse FATP5 genes than
to the FATPs of any of the lower organisms. Whether this reflects coevolution of the
mycobacterial and human genes awaits further study.

Materials and Methods

The following materials and methods were used in the work described in
Examples 6-10.

Isolation of full-length human FATP1 and 4

Full-length clones encoding human FATP1 and human FATP4 were
identified by searching databases for sequences similar to murine FATP1-5 coding
regions using the BlastX algorithm (Altschul et al., J. Mol. Biol. 215:403-410, 1990).

A concatamer of nucleotide sequences comprising the coding sequences of
mmFATP1 (Genbank Accession U15976), mmFATP2, mmFATP3 (SEQ ID NO:6),
mmFATP4 (SEQ ID NO:8) and mmFATP5 (SEQ ID NO:10) was used to search the
Millennium database using the BLASTX algorithm. Sequences with a score >150
were evaluated for whether they represented known FATP coding sequences.

Human clones with similarity to the 5' end of murine FATP sequences were
sequenced completely. Clones encoding full-length human FATP1 were obtained
from a heart cDNA library constructed in the mammalian expression vector pMET7
(Tartaglia et al., Cell 83:1263-1271, 1995). Clones encoding full-length human
FATP4 were obtained from a spleen cDNA library constructed in the mammalian
expression vector pMET7.

Isolation of full-length human FATP6

Several clones encoding human FATP6 were identified by searching public
databases as described above. Five clones were analyzed further by restriction
digestion and DNA sequencing. One of these clones (Genbank Accession #
AA412064) appeared to be full-length and its entire insert was sequenced.

DNA Sequence Analysis

Sequences were aligned with the DNAStar program using the Clustal method.
Hydrophobicity plots were generated with DNA Strider using the Kyte-Doolittle
method.
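
For readers unfamiliar with the Kyte-Doolittle method, the Python sketch below computes a sliding-window hydropathy profile from the published Kyte-Doolittle scale. It is a minimal re-implementation for illustration, not the DNA Strider code; the 19-residue window and the 1.6 cutoff are conventional choices for spotting candidate membrane-spanning segments and are assumptions here, as is the made-up test peptide.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 19):
    """Average hydropathy for every window of `window` residues.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    if len(scores) < window:
        return []
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def candidate_segments(sequence: str, window: int = 19, cutoff: float = 1.6):
    """0-based start positions of windows whose mean hydropathy exceeds `cutoff`."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, value in enumerate(profile) if value > cutoff]


# Made-up peptide: a charged stretch followed by a hydrophobic stretch.
peptide = "KDEKRDEKRD" + "LLIVALLAVILLGVALLIV" + "KRDE"
print(candidate_segments(peptide))
```

Peaks in such a profile are only suggestive; whether a hydrophobic stretch actually spans the membrane has to be established experimentally.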
In situ hybridization

Tissues were collected from 8 week old C57/Bl6 mice. Tissues were fresh
frozen, cut on a cryostat at 10 µm thickness and mounted on Superfrost Plus slides
(VWR). Sections were air dried for 20 minutes and then incubated with ice cold 4%
paraformaldehyde (PFA)/phosphate buffered saline (PBS) for 10 minutes. Slides
were washed 2 times for 5 minutes with PBS, incubated with 0.25% acetic anhydride/1
M triethanolamine for 10 minutes, washed with PBS for 5 minutes and dehydrated
with 70%, 80%, 95% and 100% ethanol for 1 minute each. Sections were incubated
with chloroform for 5 minutes. Hybridizations were performed with 35S-radiolabeled
(5 x 10^7 cpm/ml) cRNA probes, generated from the 3' untranslated regions of mouse
FATPs by PCR followed by in vitro transcription, in the presence of 50% formamide,
10% dextran sulfate, 1x Denhardt's solution, 600 mM NaCl, 10 mM DTT, 0.25%
SDS and 10 µg/ml tRNA for 18 hours at 55°C. After hybridization, slides were
washed with 10 mM Tris-HCl pH 7.6, 500 mM NaCl, 1 mM EDTA (TNE) for 10
minutes, incubated in 40 µg/ml RNase A in TNE at 37°C for 30 minutes, washed in
TNE for 10 minutes, incubated once in 2x SSC at 60°C for 1 hour, once in 0.2x SSC
at 60°C for 1 hour, once in 0.2x SSC at 65°C for 1 hour and dehydrated with 50%,
70%, 80%, 90% and 100% ethanol. Localization of mRNA transcripts was detected
by dipping slides in Kodak NTB-2 photoemulsion and exposing for 7 days at 4°C,
followed by development with Kodak Dektol developer. Slides were counterstained
with haematoxylin and eosin and photographed. Controls for the in situ
hybridization experiments included the use of a sense probe, which showed no signal
above background in all cases.

Northern Blotting

Human mRNA blots were obtained from Invitrogen or Clontech. PCR
fragments from the 3' untranslated regions of human FATPs were used as probes.
Blots were probed with 32P-labeled DNA probes using the Rapid-Hyb buffer
(Amersham, Buckinghamshire, UK) according to the manufacturer's instructions.

Cell transfection and LCFA uptake. COS cells were cotransfected, using
lipofectamine (GIBCO BRL, Rockville, MD) according to the manufacturer's
instructions, with the mammalian expression vector pCDNA3.1 (Invitrogen,
Carlsbad, CA) expressing the gene for CD2 in combination with a pMET7 expression
vector (Tartaglia et al., Cell 83:1263-1271, 1995) containing hsFATP1 (pMET7-
hsFATP1) or hsFATP4 (pMET7-hsFATP4) or pMET7 alone. Two days after
transfection, cells were assayed for CD2 expression with a phycoerythrin-coupled
anti-CD2 (PE-CD2) monoclonal antibody (PharMingen, Franklin Lakes, NJ), and
fatty acid uptake was assayed with a BODIPY-labeled fatty acid analog (Molecular
Probes) as described above.

Example 6: Determination of Expression of mmFATPs

mmFATP4, and to a lesser extent mmFATP2, are expressed at high levels in
the brush border layer of the small intestine.

Absorption of dietary fat requires transport of free fatty acids across the apical
membrane of epithelial cells in the small intestine. Previous studies suggested that
this transport is protein-mediated; however, the transport protein had not yet been
identified. In situ hybridization was performed on each of the three regions of the
small intestine (duodenum, jejunum and ileum) as well as the colon, using probes
from the 3' untranslated regions of mmFATP1, mmFATP2, mmFATP3, mmFATP4
and mmFATP5, to determine whether any of the mouse FATPs are expressed in the
small intestine. It was expected that a protein involved in fatty acid absorption would
be expressed in the epithelial cells of the small intestine, but absent from the colon.

Expression of mmFATPs in the jejunum was identical to that in the ileum in
all cases. High levels of mmFATP4 mRNA were present in the epithelial cells of the
jejunum and ileum, and lower, but significant, amounts were detected in the epithelial
cells of the duodenum. Significantly, FATP4 mRNA was absent from other cell
types of the small intestine and no FATP4 mRNA could be detected in any of the
cells of the colon. FATP2 mRNA was present in the epithelial cells of the duodenum
at a level similar to that of FATP4, but was present at lower levels in the jejunum and
ileum. No signals above background were detected for mmFATP1, mmFATP3 and
mmFATP5 in any of the intestinal tissues. mmFATP3 and FATP5 were clearly
detectable by in situ hybridization in adult liver and mmFATP1 could be detected in a
variety of tissues on a whole embryo in situ, indicating that the FATP1, 3, and 5
probes were working.

mmFATP4 expression is predominant in the small intestine compared to the
other organs of the mouse embryo. In the small intestine, FATP4 expression is
limited to differentiated enterocytes, while no signal is detected in the connective
tissue or the undifferentiated epithelial cells in the crypts. Differentiated enterocytes
are known to be the cells that mediate the uptake of fatty acids. FATP4 is specifically
and strongly expressed in the epithelial cells of adult murine duodenum and ileum but
not colon. Other FATPs, such as FATP5, are not expressed in the small intestine.
Thus, FATP4 is the major FATP in the mouse small intestine. Given its high level of
expression, it is likely that FATP4, and to a lesser extent FATP2, play an important
role in the absorption of fatty acids.

mmFATP2 and mmFATP5 are expressed in hepatocytes

Northern analysis of mmFATP2, mmFATP3, mmFATP4 and mmFATP5
showed expression in the liver. To determine whether these proteins are present in
hepatocytes or other cell types present in liver homogenates, in situ hybridizations
were performed. mmFATP2 and mmFATP5 mRNA was clearly present in
hepatocytes, and was not concentrated in other cell types such as endothelial cells or
macrophages. No signal above background was detected for mmFATP1 in any of the
cell types in the liver, consistent with the results of the Northern blotting.



WO 01/21795 PCT/US00/25891

Example 7: Isolation and Sequence Analysis of Full-length Human FATP1 and Full-
length Human FATP4

To identify human cDNA clones encoding FATP family members,
Millennium databases were searched for sequences similar to murine FATP1-5
coding regions. Two clones were analyzed in detail; inspection of the entire DNA
sequence of these two clones showed that they encode the human orthologs of
mmFATP1 and mmFATP4, respectively. These two clones were designated
hsFATP1 and hsFATP4, and their DNA and predicted protein sequences are shown
in Figures 44A-44C and 45, and 50A-50C and 51. hsFATP1 is predicted to encode a
646 amino acid, 71 kD protein with multiple membrane-spanning domains (Figure
28A). hsFATP4 is predicted to encode a 643 amino acid, 72 kD protein with
multiple membrane-spanning domains (see Figure 29A). A comparison of the DNA
sequences of mouse and human FATP1 and mouse and human FATP4 (Figures 30A-
30B and 31A-31B) shows that the mouse and human orthologs are 85% (FATP1) and
87% (FATP4) identical to each other within the coding sequences given in these
figures. At the amino acid level, hsFATP1 and hsFATP4 are ~90% identical to their
respective mouse orthologs within the coding region shown in these figures (Figures
32 and 33). The sequence identities between mouse and human FATP1 and FATP4
are considerably higher than the ones observed between different FATP family
members within one species (~40%-60%) and are present in the N-terminal part of
the protein, a region that is poorly conserved between different FATP family
members. This high degree of sequence conservation clearly demonstrates that the
newly identified human FATPs are orthologs of mouse FATP1 and FATP4 rather
than novel FATP family members.

Table 4 is an identity/similarity matrix comparing the amino acid sequences
of FATP1 and 4 from human and mouse. This shows that the gene whose sequence
is shown in Figure 43A is indeed human FATP4, since it is 91% identical with the
murine FATP4 but only 62% identical with the closest related human FATP, which is
FATP1.

Table 4. Identity/Similarity Matrix
(percent identity below the diagonal, percent similarity above)

           hsFATP4   mmFATP4   hsFATP1   mmFATP1
hsFATP4       -        93.2      72.3      72.0
mmFATP4     91.0        -        71.2      71.1
hsFATP1     61.9      61.0        -        92.4
mmFATP1     60.7      59.6      89.5        -





Example 8: Isolation and Sequence Analysis of Full-length Human FATP6

A search of EST databases identified a set of overlapping human sequences
that were similar to FATPs, but did not have a clear mouse ortholog. One of these
EST clones was found to encode a full-length cDNA. The entire insert of this clone
was sequenced and designated hsFATP6. The DNA and predicted protein sequences
of hsFATP6 are shown in Figures 54A-54C and 55. hsFATP6 is predicted to encode
a 619 amino acid, 70 kD protein with multiple membrane-spanning domains (Figure
35A). A comparison of the amino acid sequences of hsFATP6 with other human
FATPs shows about 37% identity to either hsFATP1 or hsFATP4 (Figure 36). This
degree of sequence identity is similar to what is observed between different mouse
FATPs. The phylogenetic analysis described above clearly demonstrates that
hsFATP6 is a member of the FATP family, but not an ortholog of any of the mouse
FATPs. Comparisons were done with "ALIGN" (E. Myers and W. Miller, "Optimal
Alignments in Linear Space," CABIOS 4:11-17 (1988)) using standard settings.

Example 9: Tissue Distribution of Human FATPs

The tissue distribution of human FATPs was assessed by Northern blotting.
Human FATP3 was expressed in a large variety of tissues. In contrast, human
FATP5 was present at high levels in the liver, but was undetectable in all other
tissues examined. Thus, both hsFATP3 and hsFATP5 recapitulate the expression
pattern of their mouse orthologs (see above). hsFATP6 is a novel FATP with no
mouse ortholog as yet. Northern blotting shows that hsFATP6 is expressed at high
levels in the heart, but is undetectable in other tissues, including skeletal and smooth
muscle. This tissue distribution suggests that human FATP6 performs an important
role in energy metabolism in the heart; blocking FATP6-mediated fatty acid transport
may therefore be beneficial for a number of heart diseases, e.g., ischemic heart
disease.

To identify the major FATP expressed in the human small intestine, Northern
blotting was performed on a blot containing mRNA from human stomach, jejunum,
ileum, colon, rectum and lung. hsFATP5 and hsFATP6 were undetectable in any of
these tissues. FATP5 is only expressed in liver and FATP6 only in heart. hsFATP2
was weakly expressed in the colon, and an even weaker signal was detectable in
jejunum, ileum and lung lanes. hsFATP3 was expressed well in the lung, but was
only weakly expressed in the other tissues tested. Importantly, no difference was seen
in the expression of hsFATP3 between small intestine and stomach or colon,
suggesting that the expression observed is not related to fatty acid absorption in the
small intestine. hsFATP4 was clearly expressed in both jejunum and ileum;
expression was significantly lower in the colon and was absent in the stomach. This
expression pattern is consistent with a major role for FATP4 in absorption of fatty
acids in the human gut.

Example 10: Expression of hsFATP1 and hsFATP4 Promotes Transport of Fatty
Acids

COS cells were cotransfected using lipofectamine with the mammalian
expression vector pCDNA-CD2 in combination with one of the FATP-containing
expression vectors (pMET7-hsFATP1 or pMET7-hsFATP4) or an insertless
expression vector (pMET7, control) as described in Materials and Methods for
Examples 6-10. COS cells were gated on forward scatter and side scatter. Cells
exhibiting more than 400 CD2 fluorescence units, representing ~30% of all cells,
were deemed CD2-positive. The percent of CD2-positive cells exhibiting a BODIPY
fluorescence of >300 is plotted for the three different vectors tested (Figure 37).
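
The two-threshold readout described above (CD2 > 400 to select transfected cells, then BODIPY > 300 among those) can be sketched as follows. The thresholds come from the text; the event tuples, the function name and the toy values are hypothetical, and the forward/side scatter gate is assumed to have been applied beforehand.

```python
from typing import Sequence, Tuple

CD2_THRESHOLD = 400.0     # CD2 fluorescence units, from the text
BODIPY_THRESHOLD = 300.0  # BODIPY fluorescence units, from the text


def percent_bodipy_high(
    events: Sequence[Tuple[float, float]],
    cd2_threshold: float = CD2_THRESHOLD,
    bodipy_threshold: float = BODIPY_THRESHOLD,
) -> float:
    """Percent of CD2-positive events whose BODIPY signal exceeds the threshold.

    Each event is a (cd2, bodipy) pair for one scatter-gated cell.
    """
    cd2_positive = [bodipy for cd2, bodipy in events if cd2 > cd2_threshold]
    if not cd2_positive:
        raise ValueError("no CD2-positive events")
    high = sum(1 for bodipy in cd2_positive if bodipy > bodipy_threshold)
    return 100.0 * high / len(cd2_positive)


# Hypothetical events, not measured data.
toy_events = [(450, 350), (500, 120), (100, 900), (800, 301), (401, 300), (50, 10)]
toy_percent = percent_bodipy_high(toy_events)
```

Restricting the BODIPY readout to CD2-positive cells keeps untransfected cells in the same dish from diluting the measured effect.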

Example 11: Stable Expression of Human FATP4 in 293 Cells

Stable cell lines were generated as follows. A DNA fragment containing the
entire hsFATP4 coding sequence as well as 100 nucleotides of 5' and 50 nucleotides
of 3' untranslated region was inserted into the vector pIRES-neo (Clontech, Palo Alto,
CA) using standard cloning techniques. The resulting construct or a vector control
(pIRES-neo) was transfected into 293 cells using the lipofectamine method (Gibco
BRL, Rockville, MD) according to the manufacturer's directions. Cells that had
taken up the DNA were selected with 1 mg/ml G418 (Gibco BRL, Rockville, MD).
Single colonies were picked 1 to 2 weeks after transfection and grown in medium
containing 0.8 mg/ml G418. Colonies were screened for the ability to take up fatty
acids by measuring uptake of a fluorescently labeled fatty acid (BODIPY-FA). About
40 colonies transfected with the pIRES-neo containing FATP4 and ~20 colonies
transfected with pIRES-neo control were analyzed. All 20 of the vector control
clones showed amounts of BODIPY-FA uptake similar to each other and to
untransfected 293 cells. In contrast, among the 40 FATP4 transfected clones, 3 had a
5- to 10-fold increased BODIPY-FA uptake compared to any of the vector controls,
and a large number (~20) showed an approximately two-fold increase in BODIPY-FA
levels. This distribution is consistent with FATP4 conferring increased fatty acid
uptake in these cells. One of the cell lines with the highest amount of BODIPY-FA
uptake was selected to be used for measuring uptake of tritiated fatty acid.

The uptake of tritiated oleate by either FATP4 expressing or control cells was
assayed over time. Expression of FATP4 increases the rate of fatty acid
uptake by over 3-fold, demonstrating that FATP4 is, like the other FATPs, a
functional fatty acid transporter (Figure 38).

Example 12: Immuno-staining with FATP4-Specific Antiserum

A polyclonal antiserum against the C-terminus of mmFATP4 was raised using
a GST-fusion protein having mmFATP4-specific amino acid sequence 552-643
(AVASP...GEEKL). In western blot experiments, the purified antibody reacted
strongly with a synthetic peptide matching the C-terminus of mmFATP4, but not with
a corresponding region of mmFATP2, mmFATP3, or mmFATP5. The
mmFATP4-specific polyclonal antiserum detects, in western blot experiments with
enterocyte lysates from 3 different mice, a ~70 kDa protein, which is in accordance
with mmFATP4's predicted molecular weight of 72 kDa. The binding is specific for
mmFATP4, since it can be completely abolished by preincubation of the antiserum
with the GST-fusion peptide used to raise the antibody.

Immunofluorescence experiments were performed using the anti-mmFATP4 
antiserum on fresh frozen sections of murine small intestine. The antibody binding 
demonstrates strong expression of mmFATP4 in enterocytes, confirming the results 
of the in situ hybridization experiments. At higher magnifications it is apparent that
mmFATP4 is expressed at the apical side of the enterocyte, indicating that the
transporter is present in the brush border membrane, which is known to mediate the
uptake of fatty acids from the intestinal lumen.

Immuno-electron microscopy studies were performed on fresh frozen murine
intestinal cells. The gold particles used, appearing as black specks on the electron
micrographs, indicate the subcellular localization of mmFATP4 to be on the
microvilli of the enterocyte. It can be seen from electron micrographs that
mmFATP4 is localized exclusively in membranes, preferentially the apical plasma
membrane, confirming that it is indeed a membrane protein.

Methods for Immunofluorescence and Immunogold Electron Microscopy

Unfixed mouse small intestine was washed with Hank's buffered salt solution
containing 1 mM EDTA, infused with 2.3 M sucrose solution, and embedded in
O.C.T. 4583 compound. The material was thick sectioned (15 µm - 40 µm). The
sections were washed in PBS containing 1% BSA and 0.075% glycine to block
non-specific binding. Primary and secondary antibodies were diluted in PBS with 10%
FCS and incubated for 1 h. The sections were mounted in 90% glycerol/PBS
containing 1 mg/ml paraphenylenediamine, and examined with a Bio-Rad MRC 600
confocal, mounted on a Zeiss Axioscop.

For the immunogold labeling, the tissue was fixed with 2% paraformaldehyde
in PBS for 10 minutes, after which it was cryoprotected by infiltration with 2.3 M
sucrose in 0.1 M phosphate buffer (pH 7.4) containing 20% polyvinylpyrrolidone,
and then mounted on aluminum cryo nails and frozen in liquid nitrogen (Tokuyasu,
K.T., J. Microscop. 143:139-149, 1986). Ultrathin sections were collected on
carbon/formvar-coated nickel grids. The primary antibody (anti-FATP4) was diluted
in 10% FCS in PBS and incubated overnight at 4°C, followed by donkey anti-rabbit
IgG-gold (12 nm) (Jackson Labs) for 1 h. The sections were stained in 2% neutral
uranyl acetate (20 minutes) and absorption stained with 2% uranyl acetate in 0.2%
methylcellulose containing 3.2% polyvinyl alcohol. The sections were examined
with a Philips EM 410 electron microscope.

Example 13: Inhibition of Fatty Acid Uptake Specific to FATP4 Demonstrated in
Isolated Mouse Enterocytes

Phosphorothioate derivatives of the following oligonucleotides were
synthesized:

FATP4-AS2     CCCCCACCAGAGAGGCTCC (SEQ ID NO: 103)
FATP4-AS2MM   CCACCCCCGGAAAGCCTGC (SEQ ID NO: 104)
FATP4-S2      GGAGCCTCTCTGGTGGGGG (SEQ ID NO: 105)

FATP4-AS2 is the antisense oligo; it is designed to be complementary to the
sequence extending from nucleotide 10 to nucleotide 28 of the mouse FATP4 coding
sequence. FATP4-AS2MM is a control oligo; in this oligo every third nucleotide was
changed, creating mismatches; the overall nucleotide composition is identical to
FATP4-AS2 (same number of G, A, T, C). FATP4-S2 is the sense control.
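
The stated relationships among the three oligonucleotides can be checked directly from their sequences. The sketch below uses the sequences as listed above; the helper names are illustrative and not part of the original methods.

```python
from collections import Counter

FATP4_AS2 = "CCCCCACCAGAGAGGCTCC"    # antisense (SEQ ID NO: 103)
FATP4_AS2MM = "CCACCCCCGGAAAGCCTGC"  # mismatch control (SEQ ID NO: 104)
FATP4_S2 = "GGAGCCTCTCTGGTGGGGG"     # sense control (SEQ ID NO: 105)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatch_positions(a: str, b: str) -> list:
    """1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


# The sense oligo is the reverse complement of the antisense oligo.
sense_matches = reverse_complement(FATP4_AS2) == FATP4_S2
# The mismatch control has the same base composition as the antisense oligo.
same_composition = Counter(FATP4_AS2) == Counter(FATP4_AS2MM)
# Mismatches fall at every third position of the 19-mer.
positions = mismatch_positions(FATP4_AS2, FATP4_AS2MM)
```

Matching base composition is what allows the mismatch oligo to control for effects of the phosphorothioate chemistry itself rather than of the sequence.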

Enterocytes were isolated from the small intestine of mice and incubated for
48 h in tissue culture (Figure 40) either without oligonucleotides (squares) or with
100 µM FATP4 specific sense (circles) or antisense (diamonds) oligonucleotides. The
uptake over time of 25 µM oleate was then measured. While the FATP4 sense
oligonucleotide did not significantly influence the uptake, the antisense
oligonucleotide inhibited fatty acid uptake by ~50%.

The effect of either FATP4 sense, antisense or mismatch sequence
oligonucleotides on the uptake of fatty acids was measured in enterocytes. Isolated
enterocytes were incubated with increasing concentrations of FATP4 antisense
oligonucleotides (solid bars in Figure 41), or a mismatch control oligonucleotide with
identical nucleotide composition (stippled bars), or with 100 µM of the FATP4
sense-oligonucleotide (lined bar). The medium for this incubation was Dulbecco's
modified Eagle's medium with 4.5 g/L glucose, 1 mM sodium pyruvate, 0.01 mg/ml
human transferrin and 10% fetal bovine serum. After 48 hours of incubation the
uptake of oleate by enterocytes was measured over a 5 minute time interval.
Measurements were done in quadruplicate. The uptake assay was done in Hank's
buffered salt solution with 10 mM taurocholate. Only the enterocytes given FATP4
antisense oligonucleotide showed a concentration dependent decrease of fatty acid
uptake, with ~50% inhibition at a concentration of 100 µM. This effect was FATP4
specific: only the antisense oligonucleotide, which can bind to the FATP4
mRNA and block its translation, inhibited uptake, whereas a control oligonucleotide
differing only in sequence but not in nucleotide content did not. This rules out a
toxic or otherwise nonspecific inhibitory effect of the oligonucleotide due to its
chemical composition.

As a further control experiment, the uptake of oleate was measured along with
the uptake of methionine in the same cultured enterocytes. Antisense
oligonucleotide, mismatch sequence oligonucleotide, or no oligonucleotide was
added to a concentration of 100 µM to cultures of enterocytes. After incubation for
48 hours, the uptake of both 3H-labeled oleate and 35S-labeled methionine was
assayed. Results are shown in Figure 42. Fatty acid uptake is at the left side of the
paired bars; methionine uptake is on the right side of the paired bars. The fact that
amino acid uptake was not influenced by the antisense oligonucleotide treatment
further supports the conclusion that the antisense oligonucleotide causes a specific
reduction in translation of FATP4-specific mRNA.

Example 14: mmFATP2 Is Expressed in Proximal Renal Tubule Epithelium 

Northern analysis showed that mmFATP1, mmFATP2, and mmFATP4 are
present in the kidney. In situ hybridization (methods as for Example 6) was
performed to determine which cell type(s) of the kidney these mRNAs are expressed
in. mmFATP1 mRNA was present in virtually all cells throughout the kidney with
no obvious preference for a particular cell type. In contrast, mmFATP2 was
expressed only in the renal cortex. Within the cortex, expression of mmFATP2 was
restricted to the epithelial cells of the proximal renal tubules. The primary function
of proximal renal tubule cells is the reabsorption of filtered salts and nutrients (e.g.,
glucose), a process that requires mitochondrial oxidation and that can utilize fatty
acids as energy substrates. Based on the localization of mmFATP2, it is possible that
mmFATP2 is important for reabsorption in the kidney by allowing uptake of an
energy source (fatty acids) from the blood into renal epithelial cells. Alternatively, if
fatty acids need to be reabsorbed in the kidney, similarly to glucose, FATP2 could be
involved in the reabsorption of fatty acids. Determination of the subcellular
localization of FATP2 will distinguish between these two possibilities.

obtained from a human bone library constructed in the mammalian expression vector
pMET7 (Tartaglia, L.A. et al., Cell 83: 1263-1271, 1995). To identify human cDNA
clones encoding FATP family members, databases were searched for sequences
similar to murine FATP1-5 coding regions. One clone was found to encode the
human ortholog of mmFATP3 and was designated hsFATP3. The DNA and
predicted protein sequences of hsFATP3 are shown in Figures 94A and 94B.
hsFATP3 is predicted to encode a 702 amino acid, 75.6 kD protein with multiple
membrane-spanning domains. A comparison of the DNA sequences of mouse and
human FATP3 shows that the mouse and human orthologs are 81% identical to each
other within the coding region. At the amino acid level, hsFATP3 is ~86% identical
to mmFATP3 within the coding region. The sequence identities between mouse and
human FATP3 are considerably higher than those observed between different FATP
family members within one species (~40%) and are present in the N-terminal part of
the protein, a region that is poorly conserved between different FATP family
members.

Example 16: Substrate Specificity of Fatty Acid Transport in hsFATP-Transfected
Clones

Using a mammalian expression vector, we generated 40 stable 293 cell lines
expressing hsFATP4 and 20 cell lines transfected with a control plasmid. The ability
of the different cell lines to take up FA, as assessed by uptake assays using the
fluorescently labeled Bodipy-palmitate, correlated well with their FATP4 expression
levels determined by Western blotting (FIG. 95). All 20 vector control clones showed
amounts of Bodipy-FA uptake similar to each other and to untransfected 293 cells. In
contrast, among the 40 FATP4 transfected clones, a large number (~20) showed an
approximately 2-fold increase in Bodipy-FA uptake compared to any of the vector
controls, and three had a 5- to 10-fold increase in Bodipy-FA uptake.

Several of the cell lines with the highest amount of Bodipy-FA uptake as well
as isolated primary enterocytes were used to measure the uptake of radiolabeled FAs.
Short-term uptake by 293 cells and enterocytes of all FAs tested was linear (FIG. 97).
hsFATP4 expression enhanced the rate of palmitate uptake approximately 3-fold over
293 cells transfected with vector alone (FIG. 97) and also accelerated the uptake of
oleate but not of linolate, arachidonate, octanoate, butyrate or cholesterol (Table 6).
Isolated primary enterocytes showed a similar preference for palmitate and oleate,
and absence of transport of arachidonate, octanoate, and butyrate, but displayed a
more robust transport of linolate and cholesterol than the transfected 293 cells.

To further characterize the substrate specificity of FATP4, we measured the
uptake by stably transfected 293 cells of 5 µM Bodipy-FA in the presence of a
20-fold molar excess (i.e., 100 µM) of FAs, FA-derivatives and lipid soluble vitamins
and hormones. Both saturated and non-saturated fatty acids containing 10 to 26 C
atoms strongly competed for uptake of Bodipy-palmitate (FIG. 96 and Table 7) and
thus are presumed to be substrates of FATP4. In contrast, fatty acids with eight or
fewer C atoms did not compete and thus are presumed not to be FATP4 substrates.
Similarly, esters of long chain FAs and other hydrophobic molecules tested had no
effect on uptake of Bodipy-palmitate.

LCFA Uptake Assays (Methods)

Bodipy-FA uptake assays using FACS were performed, adapted to a 96-well
format. LCFA uptake assays with enterocytes or with stably transfected 293 cells
were done as follows. Mixed micelles of radiolabeled FA (NEN) and taurocholate
(Sigma) in HBS were generated by brief sonication at 37°C. Equal volumes of cells
and micelle solution were mixed, resulting in a final FA concentration of 25 µM for
antisense assays and 10 µM for substrate specificity assays. Final taurocholate
concentration was 5 mM. Cells were incubated for the indicated amount of time at
37°C. The assay was stopped by transferring the cells onto filter paper followed by
extensive washes with ice-cold HBS containing 0.1% BSA using a cell harvester
(Brandell). Incorporated oleate was then determined by β-scintillation counting
(Beckman).



Table 6

Uptake of Different Substrates by FATP4 Expressing Cell Lines and Enterocytes

Fatty Acid      293 Cells    293 Cells Stably     FATP4       Enterocytes*
                Control*     Expressing FATP4*    specific*
Palmitate       564          1695                 1131        3036
Oleate          662          1122                 459         117
Linolate        640          673                  33          116
Arachidonate    3            5                    2           0
Octanoate       0            0                    0           5
Butyrate        0            50                   50          73
Cholesterol     319          345                  26          531

Uptake of different substrates by enterocytes and by control
and stable FATP4-expressing 293 cells. The rates of uptake
for the indicated fatty acids were measured over 4 min, taking
measurements every 30 s. All fatty acids were at a
concentration of 10 µM in HBS containing 5 mM taurocholate.

*Uptake measured as pmol/min per 10^6 cells
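
The "FATP4 specific" column of Table 6 appears to be the uptake rate of FATP4-expressing 293 cells minus that of control 293 cells. The following sketch recomputes that difference and the corresponding fold change from the tabulated values. The subtraction rule is inferred from the numbers rather than stated in the text, and the function names are illustrative.

```python
# (control 293 cells, FATP4-expressing 293 cells), pmol/min per 10^6 cells, from Table 6
TABLE_6 = {
    "Palmitate": (564, 1695),
    "Oleate": (662, 1122),
    "Linolate": (640, 673),
    "Arachidonate": (3, 5),
    "Octanoate": (0, 0),
    "Butyrate": (0, 50),
    "Cholesterol": (319, 345),
}


def fatp4_specific(control: float, expressing: float) -> float:
    """Uptake attributable to FATP4: expressing cells minus vector control."""
    return expressing - control


def fold_increase(control: float, expressing: float):
    """Ratio of expressing to control uptake; None when control uptake is zero."""
    if control == 0:
        return None
    return expressing / control


specific = {name: fatp4_specific(c, e) for name, (c, e) in TABLE_6.items()}
folds = {name: fold_increase(c, e) for name, (c, e) in TABLE_6.items()}
```

For palmitate this reproduces the tabulated difference of 1131 and a ratio close to 3, in line with the approximately 3-fold enhancement reported in the text. For oleate the recomputed difference is 460 against the tabulated 459, which presumably reflects rounding of the underlying measurements.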



Table 7

Competition of Bodipy-FA Uptake by FATP4 Expressing Cells

Fatty Acids                          Formula            Competition
Butyric Acid                         C4H8O2             -
[rows illegible in source; one "++" entry survives]
Myristic Acid                        C14H28O2           ++
Palmitic Acid                        C16H32O2           ++
Stearic Acid                         C18H36O2           +
Oleic Acid                           C18H34O2           ++
Linoleic Acid                        C18H32O2           ++
Arachidic Acid                       C20H40O2           ++
Lignoceric Acid                      C24H48O2           ++
Cerotic Acid                         C26H52O2           ++

Fatty Acid Derivatives

Fatty Acids                          Formula            Competition
Palmitic Acid Methyl Ester           C17H34O2           -
Stearic Acid Methyl Ester            C19H38O2           -
Oleic Acid Ethyl Ester               C20H38O2           -
Oleic Acid Oleyl Ester               C36H68O2           -
Oleoyl CoA                           C39H68N7O17P3S     -
Cholesteryl Oleate                   C45H78O2           -

Table 7 Continued

Competition of Bodipy-FA Uptake by FATP4 Expressing Cells
Lipid-Soluble Vitamins & Hormones

Fatty Acids                          Formula            Competition
Retinoic Acid (Pro-Vitamin A)        C20H28O2           ±
Ergocalciferol (Vitamin D2)          C28H44O2           -
Tocopherol (Vitamin E)               C29H50O2           -
3-Phytylmenadione (Vitamin K1)       C31H46O2           -
Prostaglandin E2                     C20H32O5           -

Competition for Bodipy-FA uptake by FATP4 expressing cells by
different hydrophobic compounds. The uptake of 5 µM Bodipy-FA
(C1-Bodipy-C12) was measured in the presence of a 20-fold molar
excess (i.e., 100 µM) of the indicated fatty acids or fatty acid
derivatives. The maximal 100% inhibition was defined as the
amount of Bodipy-FA incorporated in the presence of 200 µM lauric
acid, which was on average 18% ± 5% that of untreated cells.

-: 0% - 30% inhibition by the indicated substance
±: 30% - 50% inhibition
+: 50% - 70% inhibition
++: 70% - 100% inhibition
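
The scoring scheme in the legend above can be sketched as a small function. It rescales uptake so that untreated cells give 0% inhibition and the lauric acid floor gives 100%, then maps the result to the four symbols. The handling of values that fall exactly on a boundary (assigned to the higher category here) and the function names are assumptions; the legend does not specify them.

```python
def percent_inhibition(uptake: float, untreated: float, floor: float) -> float:
    """Inhibition rescaled so that untreated uptake is 0% and the floor is 100%.

    `floor` is the uptake remaining with 200 µM lauric acid
    (on average 18% of untreated uptake, per the legend).
    """
    if untreated <= floor:
        raise ValueError("untreated uptake must exceed the floor")
    return 100.0 * (untreated - uptake) / (untreated - floor)


def competition_symbol(inhibition_pct: float) -> str:
    """Map a percent inhibition to the Table 7 symbols (boundaries assumed inclusive below)."""
    if inhibition_pct >= 70:
        return "++"
    if inhibition_pct >= 50:
        return "+"
    if inhibition_pct >= 30:
        return "±"
    return "-"


# Hypothetical readings, expressed as percent of untreated uptake.
untreated, floor = 100.0, 18.0
symbol_strong = competition_symbol(percent_inhibition(18.0, untreated, floor))
symbol_none = competition_symbol(percent_inhibition(100.0, untreated, floor))
```

Defining 100% inhibition by the lauric acid floor rather than by zero uptake means the residual, apparently non-competable signal is excluded from the scale.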

Example 17: Identification and Characterization of the FATP5 Promoter

METHODS

BAC Isolation and Luciferase Constructs

An arrayed BAC library was screened by PCR for FATP5 genomic clones.
PCR primers designed by a program from the Whitehead Institute's Genome Center
specifically amplified a single band of the correct size from mouse genomic DNA.
Two putative BACs containing the FATP5 genomic sequence were identified and the
presence of FATP5 sequence was confirmed by dot hybridization of the BAC with
the mmFATP5 cDNA.

After isolation of positive BACs, large amounts of bacteria were grown and
DNA prepared using a Qiagen maxi-prep kit (Qiagen, Venlo, The Netherlands). The
BAC was digested with Sac I and ligated into pZero-2 (Invitrogen, Carlsbad, CA).
Inserts containing mmFATP5 genomic sequence were identified by screening colony
lifts of the ligation with an α-32P-ATP radiolabeled, random primed
(Boehringer-Mannheim, Indianapolis, IN) mmFATP5 cDNA as a probe. Positive
colonies were picked and restriction analysis with Sac I revealed them to contain an
identical, large insert of 8-10 kb. Digestion of the Sac I fragment with BstX I yielded
three pieces that were subsequently subcloned into pZero and sequenced using an ABI
sequencer (Research Genetics). A 1.3 kb piece containing sequence immediately
upstream of the FATP5 initiator methionine was subcloned into the Xho I and Bgl II
sites of the promoter-less pGL3 luciferase reporter vector (Promega Corp., Madison,
WI). 7 kb of additional upstream sequence was subcloned into the Xho I and Sac I
sites of the prior construct to yield a final construct containing approximately 8 kb of
genomic sequence upstream of the initiator methionine. Deletions of the FATP5
promoter were constructed using PCR with the 1.3 kb promoter construct as the
template. Products were amplified with primers containing Hind III (5' primer) and
Xho I (3' primer) sites using Elongase (Gibco, Rockville, MD). The resulting
fragments were cut with Hind III and Xho I and subcloned into the corresponding
sites of the promoter-less pGL3 luciferase reporter vector. The internal 30 base pair
deletions, GC box mutations, and 10 nucleotide linker scan were all created with the
Quickchange mutagenesis kit (Stratagene, La Jolla, CA) according to the
manufacturer's instructions. At least two different bacterial colonies were picked for
each construct. The inserts from both colonies were sequenced to check for
unintended point mutations and both constructs were assayed for luciferase activity.

Cell culture, Transfection, and Luciferase Measurements

HepG2, Hep3B, HT1080, 3T3-L1, BOSC, and HACAT cells were grown in
DMEM supplemented with 10% fetal calf serum, 1 x penicillin-streptomycin and
glutamine (Gibco, Rockville, MD). Mink lung cells were grown in MEM
supplemented with 10% fetal calf serum, 1 x minimal essential amino acids, 1 x
penicillin-streptomycin and glutamine. The evening prior to transfection, cells were
plated at 50-60% confluence in 24 well dishes. The following morning, cells were
placed in 2 mL of fresh media and 250 µL of a CaPO4 solution (Invitrogen, Carlsbad,
CA) containing 2 µg of a luciferase reporter construct and 0.5 µg of pCMV-β-gal
was added to the cells. pCMV-β-gal constitutively expresses β-galactosidase and
was used to normalize transfection efficiency (Hua et al., 1998). After 12 hours, the
cells were washed twice with DMEM and placed in fresh media. Thirty six hours
later, the media over the cells was removed and 250 µL of 1 x reporter lysis buffer
(Promega Corp., Madison, WI) was added. After vigorous shaking for 15 minutes at
room temperature, the supernatants were transferred to Eppendorf tubes and briefly
centrifuged to remove particulates. 20 µL from these tubes was used for
determination of luciferase activity (Promega Corp., Madison, WI) and 20 µL was
used for the measurement of β-galactosidase activity (Clontech, Palo Alto, CA). All
luciferase values were normalized to β-galactosidase to control for transfection
efficiency and expressed as relative luciferase units (RLU). For experiments
comparing different cell lines, promoter activity was computed as a fold induction by
dividing the RLU activity of either the -8 or -271 promoter constructs by the RLU
activity of a promoter-less construct. Each data point was done in triplicate and each
experiment was repeated a minimum of three times.
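
The normalization described above can be sketched in a few lines. Each well yields a luciferase reading and a β-galactosidase reading; RLU is their ratio, and fold induction divides the construct's RLU by that of the promoter-less vector. Averaging the triplicate wells after normalization is an assumption (the text does not say at which step replicates were combined), and the readings below are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple

Well = Tuple[float, float]  # (luciferase reading, beta-galactosidase reading)


def relative_luciferase_units(luciferase: float, beta_gal: float) -> float:
    """Luciferase signal normalized to the co-transfected beta-galactosidase signal."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase reading must be positive")
    return luciferase / beta_gal


def mean_rlu(wells: Sequence[Well]) -> float:
    """Mean RLU across replicate wells (normalize per well, then average)."""
    return mean(relative_luciferase_units(luc, gal) for luc, gal in wells)


def fold_induction(construct: Sequence[Well], promoterless: Sequence[Well]) -> float:
    """RLU of a promoter construct divided by RLU of the promoter-less vector."""
    return mean_rlu(construct) / mean_rlu(promoterless)


# Hypothetical triplicate readings, not measured data.
construct_wells = [(7000.0, 100.0), (7200.0, 100.0), (6800.0, 100.0)]
promoterless_wells = [(200.0, 100.0), (300.0, 100.0), (100.0, 100.0)]
induction = fold_induction(construct_wells, promoterless_wells)
```

Normalizing each well before averaging keeps a single poorly transfected well from skewing the luciferase mean.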

Northern Blots, Preparation of Nuclear Extracts, and Gel Shift Assays

Human poly-A northern blots were purchased from a commercial vendor
(Clontech, Palo Alto, CA) and probed with a piece of the human FATP5 3'
untranslated region specific for FATP5. Nuclear lysates from HepG2 and BOSC
cells were essentially prepared according to the method of Hua et al. and stored at
-80°C (Hua et al., 1998). Probes for gel shift assays were end labeled using T4
polynucleotide kinase (Boehringer-Mannheim, Indianapolis, IN) and gel purified.
Gel shifts were performed at room temperature in 30 µL reactions comprised of 6 µL
5X binding buffer (100 mM Tris 8.0, 300 mM KCl, 5 mM EDTA, 8 mM MgCl2, and
36% glycerol), 0.5 µL of 100 mM DTT, 1 µL of 10 mg/ml BSA, 2 µL of 2 mg/ml
poly dI/dC, and 5 µL nuclear lysate. Ten minutes after the addition of nuclear lysate,
40,000 cpm of 32P-labeled probe were added. After 20 minutes at room temperature,
loading dye was added and the reaction run on a 4% non-denaturing gel.
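
The final concentrations in the 30 µL gel shift reaction follow from simple dilution arithmetic (stock concentration times volume added, divided by total volume). The sketch below works this through for the components listed above. It ignores any contribution from the nuclear lysate and probe buffers, which the text does not describe, and the function and variable names are illustrative.

```python
REACTION_VOLUME_UL = 30.0


def final_concentration(stock: float, volume_ul: float,
                        total_ul: float = REACTION_VOLUME_UL) -> float:
    """Concentration after diluting `volume_ul` of stock into `total_ul` (same units as stock)."""
    return stock * volume_ul / total_ul


# 5X binding buffer components as given in the text; 6 µL is added per reaction.
BINDING_BUFFER_5X = {
    "Tris pH 8.0 (mM)": 100.0,
    "KCl (mM)": 300.0,
    "EDTA (mM)": 5.0,
    "MgCl2 (mM)": 8.0,
    "glycerol (%)": 36.0,
}

final_buffer = {
    name: final_concentration(stock, 6.0) for name, stock in BINDING_BUFFER_5X.items()
}
final_dtt_mm = final_concentration(100.0, 0.5)      # DTT, mM
final_bsa_mg_ml = final_concentration(10.0, 1.0)    # BSA, mg/ml
final_poly_didc_mg_ml = final_concentration(2.0, 2.0)  # poly dI/dC, mg/ml
```

Under these assumptions the 5X buffer is diluted to 1X: 20 mM Tris, 60 mM KCl, 1 mM EDTA, 1.6 mM MgCl2 and 7.2% glycerol.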



RESULTS 

Human FATP5 mRNA is only expressed in adult liver

We had previously reported that mmFATP5 mRNA was only expressed in the
liver (Hirsch et al., 1998). To determine if the human isoform of FATP5 was also
liver specific, we performed northern analysis using a probe from the 3' transcribed
but untranslated region of the human gene. Similar to the mouse homolog, hsFATP5
is liver specific. Interestingly, hsFATP5 was not expressed in fetal liver, suggesting
that it may be developmentally regulated.



Identification of a FATP5 promoter 

We next set out to determine the cis-acting elements responsible for liver
specific expression of FATP5. We identified BACs containing the FATP5 genomic
locus and subcloned a 10 kb Sac I fragment which was subsequently sequenced. The
Sac I fragment contains approximately 8 kb of genomic sequence upstream of the
FATP5 initiator methionine. Blast searches using the 5' end of the Sac I sequence
revealed that it contained coding sequence for an unknown gene immediately
upstream of FATP5. Since the FATP5 promoter is unlikely to overlap the coding
sequence of another gene, we hypothesized that the 10 kb Sac I fragment contained
the FATP5 promoter. To test this hypothesis, 8 kb of genomic DNA upstream of the
translational initiator of FATP5 was subcloned into the promoter-less pGL3 luciferase
reporter vector. This construct was transiently transfected into the HepG2 liver cell
line and luciferase activity was determined. The -8 kb piece of DNA resulted in a
35-fold induction of luciferase activity when compared to a pGL3 vector without the
FATP5 genomic sequence (FIG. 100). To determine if this activity reflected tissue
specific transcription, the -8 kb luciferase reporter construct was transfected into a
variety of additional cell types. While promoter activity was also detected in the
Hep3B hepatoma cell line, non-liver cell lines did not express luciferase above the
level of the promoter-less vector. Thus, the 8 kb upstream genomic element
recapitulated liver specific expression in vitro.

The FATP5 promoter resides within the 261 base pairs upstream of the initiator
methionine and requires a single GC box

To determine the cis-acting elements in the -8 kb of genomic sequence
responsible for transcriptional activity, serial 5' deletions of the promoter were
constructed and transfected into HepG2 cells. Surprisingly, greater than 90% of the
-8 kb was dispensable for promoter activity. A construct containing only 261 base
pairs upstream of the initiator methionine resulted in promoter activity equivalent to
that of the -8 kb construct (FIG. 101). Identical results were obtained when the
deletion series was transfected into Hep3B cells (data not shown). We next
determined if promoter activity of a small genetic element was tissue specific.
Transfection of a construct containing 271 base pairs upstream of the initiator
methionine into a variety of cell lines essentially replicated the results of the -8 kb
construct in that expression was observed only in liver derived cell lines (FIG. 102).

Since deletion analysis revealed that bases between -261 and -218 were
required for promoter activity, we closely examined this region for binding sites of
known transcription factors and found the sequence GGGGCGGGG between
nucleotides -241 and -232 (FIG. 103A). This sequence binds the Sp1 family of
transcription factors and is termed a GC box. To determine if the activity of the -271
construct required the GC box, we mutated the GC box. The first construct deleted
nucleotides -241 to -222, which removed the GC box and additional downstream
sequence which, although less optimal, might also bind the Sp1 family of
transcription factors (SEQ ID NO.: 107). The second construct had three G to A point
mutations in the GC box between nucleotides -241 and -232 (SEQ ID NO.: 108). Such
mutations had previously been shown to abolish transcriptional activity of GC boxes
(Rodenburg et al., 1997). In contrast to the wild type -271 promoter, both of the
mutated constructs were transcriptionally inactive in HepG2 cells (FIG. 103B).
Identical results were also obtained in Hep3B cells (data not shown). This suggests
that the GC box between -241 and -232 is essential for transcriptional activity of the
FATP5 promoter. We next examined whether the sequences necessary for luciferase
activity also bound proteins in nuclear extracts from HepG2 cells. Two different
oligonucleotides were used for gel shift analysis. One oligonucleotide (AF-1)
contained nucleotides -250 to -230 (SEQ ID NO.: 111) and the other (AF-2) spanned
nucleotides -260 to -200 (SEQ ID NO.: 109) (FIG. 104). Both oligonucleotides
yielded three significant complexes from HepG2 nuclear extracts. All complexes
were specific, as a 100-fold excess of the same unlabeled oligonucleotide could compete
for binding of the radiolabeled oligonucleotide. Mutant AF-1 oligonucleotides
containing three point mutations in the GC box did not bind any proteins in HepG2
nuclear extracts or compete for binding of nuclear proteins to the AF-1 or AF-2
oligonucleotides (data not shown). Oligonucleotides AF-1 and AF-2 also bound
recombinant Sp1 (Promega Corp, Madison, WI, data not shown). However, nuclear
extracts from BOSC cells, a kidney cell line, and from HepG2 cells had identical
patterns of complex formation (data not shown).

Identification of novel sequences required for transcriptional activity of the FATP5
promoter

While the GC box between nucleotides -241 and -232 is essential for
transcriptional activity, additional sequences downstream of the GC box might also
be required for transcription. To determine if such sequences existed, we created 30
base pair internal deletions in the -271 construct downstream of the GC box.
Constructs that had deletions in sequences between 240 and 180 nucleotides upstream
of the FATP5 translational initiator had greatly reduced transcriptional activity in
HepG2 cells (FIG. 105). To identify the specific sequences within this region
required for FATP5 transcription, a 10 nucleotide linker (CTAACAGGAG) (SEQ ID
NO.: 113) was exchanged for wild type sequence within the context of the -271 base
pair construct (FIG. 106). Inadvertently, the 210 to 200 construct had a single
nucleotide insertion and the 190 to 180 construct had a two nucleotide insertion
relative to the wild type sequence. However, several other linker constructs that also
had equivalent insertions (230 to 220 or 170 to 160 for example) had high levels of
luciferase activity. Thus the decrease in luciferase activity in the 190 to 180 and 210
to 200 constructs is due to changes in the nucleotide sequence and not the result of
the nucleotide additions. Transfection of these DNAs into HepG2 cells revealed two
regions important for transcription. Mutating sequences between nucleotides -210
and -200 or between nucleotides -190 and -180 drastically reduced luciferase
activity (FIG. 106).
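
The linker-scanning design described above can be illustrated with a short sketch that swaps a 10-nucleotide window of a promoter for the linker CTAACAGGAG, using coordinates counted upstream from the translational start. The promoter string below is a hypothetical placeholder and not the FATP5 sequence, the coordinate convention (the last base of the string is position -1) is an assumption, and the sketch models only exact length-preserving exchanges, not the inadvertent insertions noted in the text.

```python
LINKER = "CTAACAGGAG"  # 10 nucleotide linker (SEQ ID NO.: 113)


def linker_scan(promoter: str, start_upstream: int, end_upstream: int,
                linker: str = LINKER) -> str:
    """Replace the window from -start_upstream up to (not including) -end_upstream.

    The promoter is written 5' to 3' and its last base is taken as position -1,
    so position -p corresponds to string index len(promoter) - p.
    """
    if start_upstream - end_upstream != len(linker):
        raise ValueError("window length must equal linker length")
    n = len(promoter)
    if start_upstream > n or end_upstream < 0:
        raise ValueError("window lies outside the promoter fragment")
    i, j = n - start_upstream, n - end_upstream
    return promoter[:i] + linker + promoter[j:]


# Hypothetical 271-nucleotide placeholder sequence, not the FATP5 promoter.
placeholder_promoter = ("ACGT" * 68)[:271]
mutant_210_200 = linker_scan(placeholder_promoter, 210, 200)
```

Because the exchange preserves length, any loss of reporter activity in such a construct can be attributed to the altered bases rather than to a change in spacing between flanking elements.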

In both humans and mice, FATP5 is only expressed in the liver. To determine
the promoter elements mediating liver specific transcription, we isolated a BAC
encoding the mouse FATP5 genomic locus and sequenced 10 kb upstream of the
transcriptional start. Since this 10 kb of genomic DNA did not contain either a
TATA box or GC rich regions found in TATA-less promoters, FATP5 may utilize
non-canonical sequences for transcription initiation. Unfortunately, attempts to
identify the transcriptional start using primer extension were unsuccessful, perhaps
due to secondary structure in the 5' UTR. Since we did not unambiguously determine
the transcriptional start site, the nucleotide numbering in all of the promoter
constructs refers to the distance from the translational start codon.



GC box and Sp1 transcription factors

Since another gene was situated approximately 8 kb upstream of the FATP5
initiator methionine, we hypothesized that promoter elements were likely within this
region of DNA. A luciferase reporter construct containing this sequence was
transcriptionally active in two liver cell lines but was inactive in cell lines derived
from lung, muscle, kidney, skin, or fibroblasts. Deletion analysis of the -8 kb reporter
construct revealed that the FATP5 promoter was contained within the 261 nucleotides
upstream of the initiator methionine. Promoter activity in this -261 base pair piece
required the presence of a single GC box. Gel shift assays with oligonucleotides
containing this GC box revealed the presence of three distinct complexes that
required a functional GC box for binding. GC boxes bind the Sp1 family of
transcription factors, and the multiple complexes could reflect the binding of different
members of the Sp1 protein family or different post-translational modifications of
Sp1 in HepG2 cells (Rodenburg et al., 1997). Although the Sp1 family of
transcription factors is widely expressed, Sp1 has been shown to be important for the
transcription of several liver specific genes and is upregulated in liver after birth
(Rodenburg et al., 1997). In some cases, Sp1 will facilitate the binding of a tissue
specific transcription factor to DNA. For example, Sp1 binding to DNA enhances the
binding of C/EBPβ to an adjacent site in the liver specific CYP2D5 promoter (Lee et
al., 1994). Since the C/EBPβ binding site in the CYP2D5 promoter is suboptimal,
C/EBPβ binding to this site requires the presence of Sp1 or nuclear extract. A
similar situation could occur in the FATP5 promoter. Although mutations in the 10
nucleotides downstream of the GC box had no effect on luciferase activity, we did
not test mutations immediately upstream of the GC box for effects on promoter
activity. It is also possible that Sp1 might bind an unknown liver specific
transcription factor and recruit it to the FATP5 promoter. Although there is no
experimental evidence for this, Sp1 has recently been shown to bind to a
transcriptional activator, so additional interacting proteins are possible (Ryu et al.,
1999).

Other liver specific transcription factors 

Alternatively, since the Sp1 gene family is important for the transcription of
many genes which are not liver specific, liver specific promoter elements in the
FATP5 promoter might be located elsewhere (Boisclair et al., 1993; Rongnoparut et
al., 1991; Sorensen and Wintersberger, 1999). Analysis of the sequence downstream
of the GC box using TFSearch
(http://pdap1.trc.rwcp.or.jp/research/db/TFSEARCH.html) did not reveal any
additional transcription factor binding sites of relevance (Heinemeyer et al., 1999;
Heinemeyer et al., 1998). Further, we were unable to visually identify binding sites
for known liver specific transcription factors in this sequence (De Simone and
Cortese, 1992; Hanson and Reshef, 1997; Lai, 1992). Thus, we looked
experimentally for additional promoter elements by mutating the sequence
downstream of the GC box and identified two additional sites downstream of the GC
box that were essential for FATP5 transcription. The sequences of these sites do not
conform to any known transcription factor binding sites, suggesting that either novel
proteins bind these elements or that these elements bind known proteins in a novel
manner. Preliminary gel shift data using oligonucleotides spanning these sites
suggest that these two elements may comprise a binding site for a single complex.
Further, additional data suggest that the complex which binds to these two sites
interacts with the GC box 30 base pairs upstream. Interestingly, we noted a
palindromic sequence equally split between these two sites (FIG. 107). Since many
transcription factors bind palindromic DNA elements, it is intriguing to speculate that
these two sequences contribute to the binding site for a novel transcription factor.
Current investigations are focused on identifying the proteins binding to these novel
elements and how this element interacts with the GC box.

Several studies have shown that the FATP gene family is regulated by a
variety of substances including LPS, cytokines, insulin, and diet (Frohnert et al.,
1999; Hui et al., 1998; Memon et al., 1999). Especially intriguing have been recent
reports that FATP1 is upregulated by PPARα ligands in liver cell lines (Martin et al.,
1997; Motojima et al., 1998). Since fatty acids may be endogenous activators of
PPARs, transcriptional regulation of FATP1 by PPARs may represent a physiologic
feedback loop (Gottlicher et al., 1992; Grimaldi et al., 1999; Schoonjans et al., 1996).
Given that liver also expresses FATP5, it will be interesting to see whether this gene
is also regulated by PPARα, and the tools developed here should help address this
question.

Several factors make the FATP5 promoter amenable to further study. First,
liver specific transcription of FATP5 can be recapitulated using immortalized cell
lines in vitro. Second, the minimal required promoter element that confers liver
specific transcription is very small. Third, transcriptional activity of this promoter is
very robust. Thus, further study of the FATP5 promoter may provide additional
insight into the mechanisms of liver specific transcription and regulation of the FATP
gene family.

Example 18: 
Materials and Methods 

Polyclonal antibodies were raised against proteins containing the N-terminal
domain of mouse FATP2 or the C-terminal domain of mouse FATP5 fused to
glutathione-S-transferase (GST). Tissues for immunofluorescence were collected from
8 week old mice and a 2 year old chimpanzee. Tissues were fresh frozen, cut on a
cryostat and mounted on slides. Immunofluorescence was performed as previously
described (Stahl et al., 1999). Pictures were taken on a Zeiss confocal microscope.

To determine FATP2 expression in the gall bladder, mouse gall bladder was
incubated with anti-FATP2 antibody as the primary antibody and rhodamine-labeled
anti-rabbit Ig as the secondary antibody. FATP2 antibody clearly stained the gall
bladder epithelium, but did not result in significant staining of other cell types
(Figure 108).

To further study FATP2 expression, chimpanzee liver was costained with
anti-FATP2 antibody (green) and anti-CD31 antibody (red). CD31 is expressed on
endothelial cells and is used as a marker for blood vessels. FATP2 immunoreactivity
was present in large patches which overlap with CD31 positive areas, suggesting that
FATP2 protein was present in the space of Disse, the area where hepatocytes exchange
nutrients with the blood. This implicates FATP2 in the uptake of fatty acids into
hepatocytes. In addition to areas which overlap with CD31 immunoreactivity,
FATP2 protein was also present on the cell surface of hepatocytes in a small bead
pattern. Immunoelectron microscopy of similar sections showed that FATP2
immunoreactivity was localized in the walls of bile canaliculi, which are formed by the
liver cells (Figure 109). The presence of FATP2 in bile canaliculi in the liver as well as
its presence in the gall bladder epithelium suggests a role for FATP2 in either
absorption or secretion of fatty acids into the bile. The levels of free fatty acids in the
bile have been associated with the frequency of gallstone formation.

To further study FATP5 expression, chimpanzee liver was costained with
anti-FATP5 antibody (green) and anti-CD31 antibody (red). CD31 is expressed on
endothelial cells and is used as a marker for blood vessels. FATP5 immunoreactivity
was present in large patches which overlap with CD31 positive areas, suggesting that
FATP5 protein was present in the space of Disse, the area where hepatocytes exchange
nutrients with the blood (Figure 110). This implicates FATP5 in the uptake of fatty
acids into hepatocytes.

Example 19 Identification and Characterization of Human FATP3 Proteins 

Isolation of additional human FATP3 clones

An additional clone encoding human FATP3 was identified by searching for
sequences similar to murine or human FATP3 coding regions using the BlastX
algorithm in a proprietary database (Altschul, et al., J. Mol. Biol. 215: 403-410, 1990).
One clone, which was identified by random library sequencing and is designated
johni003f04 (SEQ ID NO: 116), extends the open reading frame of the hsFATP3
polypeptide sequence by 30 amino acids at the N-terminus when compared to
previously discovered sequences. The DNA sequence of this clone is shown in
Figures 111A and 111B, and the predicted protein sequence (SEQ ID NO: 117) is
shown in Figure 112. The open reading frame of this clone begins at the initial
nucleotide and includes nucleotide 2240. The first ATG is located at nucleotide
number 51, resulting in a predicted protein which includes 730 amino acids. An
FATP signature sequence (see Hirsch et al., PNAS, 95:8625-8629, 1998) is clearly
present between amino acids 331 and 640 of hsFATP3. Within this signature
sequence hsFATP3 is 48% identical to hsFATP1 at the amino acid level. A
consensus AMP-binding motif has been identified (amino acids 333-334). Thus,
hsFATP3 is clearly a member of the fatty acid transport protein family.
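
The stated protein length can be checked against the stated nucleotide coordinates with simple arithmetic. The sketch below assumes 1-based, inclusive coordinates and assumes that nucleotide 2240 is the last base of the final counted codon; whether a stop codon is counted is not stated in the text. The helper name is hypothetical.

```python
def orf_codons(first_nt: int, last_nt: int) -> int:
    """Number of complete codons in a 1-based, inclusive nucleotide range."""
    length = last_nt - first_nt + 1
    codons, remainder = divmod(length, 3)
    if remainder:
        raise ValueError(f"range of {length} nt is not a whole number of codons")
    return codons


# First ATG at nucleotide 51, open reading frame includes nucleotide 2240.
predicted_length = orf_codons(51, 2240)  # 2190 nt / 3 = 730 codons
```

Under these assumptions the range from nucleotide 51 through 2240 spans 2190 nucleotides, that is 730 codons, which agrees with the predicted 730 amino acid protein.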

Functional analysis of FATP3 Clones

SEQ ID NO: 116 is contained in the mammalian expression vector pMET7
(Tartaglia, et al., Cell, 83: 1263-1271, 1995). To determine if the protein encoded by
this DNA sequence can mediate fatty acid uptake, SEQ ID NO: 116 was transfected
into COS cells. Uptake of a BODIPY-labeled fatty acid was determined as described
in previous experiments (Hirsch, et al., PNAS, 95: 8625-8629, 1998). Transfection
with SEQ ID NO: 116 resulted in a dramatic increase in fatty acid uptake when
compared to transfection with vector control. In this experiment, CD31 served as a
marker for transfected cells. Only CD31 positive cells were considered for analysis
(see Hirsch, et al., PNAS, 95: 8625-8629, 1998 for details). The results (Figure 113)
demonstrate that SEQ ID NO: 116 encodes a functional fatty acid transport protein.

Tissue Distribution of human FATP3 

Polyclonal antibodies were raised by immunizing rabbits with GST fused to 
the most C-terminal 89 amino acids of mmFATP3 - 
(RPPQAL^VQLYSHVSENLP 

DPSVLSDPLYVLDQDIGAYLPLTPARYSALLSGDLRI) (SEQ ID NO: 120). 
Western blotting experiments with murine tissue lysates using the anti-FATP3
antiserum closely confirmed the unique expression pattern of FATP3 as judged by
northern blot experiments. This, together with the fact that the serum reacted only
weakly with lysates from cell lines expressing either FATP1, -2, -4 or -5, indicates
that the antibody preferentially recognizes FATP3 over other FATP family
members.

FATP3 protein was detected in mouse liver, spleen, heart, kidney, testis, white
adipose tissue, and most notably in the lung. FATP3 expression in the lung
was examined further by immunofluorescence microscopy. Fresh frozen, unfixed
sections (5 to 10 µm thick) of murine and chimpanzee lungs were blocked with 10% FCS/1%
donkey serum/1% BSA in HBS and incubated overnight with anti-FATP3 serum in
blocking solution. After washing the sections, Alexa 488 conjugated donkey anti-rabbit
secondary antibodies were used to detect bound anti-FATP3 primary
antibodies and nuclei were stained with TOTO3. In later experiments, chimpanzee lung
was incubated with a mixture of rabbit anti-FATP3 and mouse monoclonal anti-CD31
to visualize FATP3 as well as blood vessels. Sections were imaged on a Zeiss
LSM510 confocal microscope. Experiments carried out once with mouse and three
times with chimpanzee lung tissue showed that FATP3 is present at high levels in
type II pneumocytes, a cell type responsible for secretion of surfactant, a
phospholipid-rich film critical for lung function. The exact function of FATP3 in type
II pneumocytes is not yet clear. One hypothesis is that FATP3 is responsible for
supplying fatty acid substrates for the synthesis of surfactant.

PCR-based experiments showed that the exocrine as well as the endocrine
pancreas expresses FATP3. This was confirmed by immunofluorescence,
performed as described above for the lung sections, on chimpanzee pancreas, which
showed FATP3 localized to the plasma membrane of acinar cells and a punctate
expression pattern on the plasma membrane and in the cytosol of alpha and beta cells
of the pancreatic islets. The identification of a fatty acid transporter in the insulin
producing cells of the pancreas has potentially broad implications for the treatment of
type II diabetes and obesity. In both diseases, fatty acid levels in the blood are
elevated and, in later stages of the disease, lead to diminished insulin secretion by the
pancreas due to the induction of apoptosis in insulin-producing beta cells
(Shimabukuro, et al., PNAS, 95: 2498-2502, 1998). Blocking fatty acid uptake into
the beta cells could possibly prevent apoptosis and maintain insulin secretion, thus
preventing the progression from obesity to diabetes.



Example 20 Identification of a fatty acid binding domain in FATP4 

GST fusion proteins were constructed in pGEX for four regions of hsFATP4
(SEQ ID NO: 52; Figure 51) which were generated by PCR and verified by
sequencing. The first three fusion proteins were constructed from regions near the
N-terminal portion of the protein. SP1 (SEQ ID NO: 121) contained amino acid residues
43-239 of the hsFATP4 sequence as shown in Figure 114A. This portion of hsFATP4
contains a lipocalin domain (as shown in Figure 117) as well as a number of residues
which in hsFATP4 are upstream of the lipocalin domain. SP2 (SEQ ID NO: 122)
contained residues 43-290 of the hsFATP4 sequence as shown in Figure 114B. This
portion of hsFATP4 contains a lipocalin domain and an AMP binding domain as
well as a number of residues which are upstream of the lipocalin domain. SP3 (SEQ
ID NO: 123) contained amino acid residues 125-290 of the hsFATP4 sequence as
shown in Figure 114C. This portion of hsFATP4 contains a lipocalin domain and
an AMP binding domain, but does not contain the upstream residues. The fourth
fusion protein was constructed from a region at the C-terminal end of the hsFATP4
polypeptide. SP5 contained amino acid residues 417-643 of the hsFATP4 polypeptide as
shown in Figure 114D (SEQ ID NO: 124).

Proteins were expressed in E. coli and purified on glutathione affinity beads
using standard techniques. To determine fatty acid binding, beads were mixed with
100 µM 14C-labeled fatty acids in mixed micelles with taurocholate (10 mM, Sigma)
and incubated for 30 minutes at room temperature. The beads were subsequently
washed with PBS containing 10 mM taurocholate, and radioactivity associated with
beads was assessed by scintillation counting. A fusion to the C-terminal domain of
hsFATP4 (SP5) did not show any oleate (ARC) binding compared to GST protein
alone, while two N-terminal fusions (SP1 and SP2) bound significant amounts of oleate
(Figure 116).



FATTY ACID    SP1           SP2           SP3         SP5         GST

Oleate        25772±1326    16172±1639    4206±631    2413±186    1511±525
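
One way to read the oleate binding values is as fold signal over the GST-only control. The sketch below is illustrative only: the units of the counts are not stated in the text, the SP2 and GST entries are read as 16172±1639 and 1511±525 (an assumption about the garbled print), and the helper name is hypothetical.

```python
# (mean, spread) of bead-associated radioactivity for oleate, as tabulated;
# units are not given in the text.
oleate_counts = {
    "SP1": (25772, 1326),
    "SP2": (16172, 1639),  # assumed reading of the garbled table entry
    "SP3": (4206, 631),
    "SP5": (2413, 186),
    "GST": (1511, 525),    # assumed reading of the garbled table entry
}


def fold_over_control(counts: dict, control: str = "GST") -> dict:
    """Divide each construct's mean signal by the mean of the control."""
    baseline = counts[control][0]
    return {
        name: mean / baseline
        for name, (mean, _spread) in counts.items()
        if name != control
    }


fold = fold_over_control(oleate_counts)
for name, value in fold.items():
    print(f"{name}: {value:.1f}-fold over GST")
```

On these figures the N-terminal fusions SP1 and SP2 are roughly 17-fold and 11-fold above GST alone, while the C-terminal fusion SP5 stays under 2-fold, in line with the description in the text.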



Similar results were obtained using maltose-binding protein fusions. MBP
fusion constructs were generated by digesting the pGEX-SP constructs with
EcoRI/XhoI and ligating the inserts into pMAL digested with EcoRI/SalI. MBP fusion proteins
were expressed in E. coli and were purified under non-denaturing conditions following
the manufacturer's instructions. To determine fatty acid binding, beads were mixed
with 100 µM 14C-labeled fatty acids in mixed micelles with taurocholate (10 mM,
Sigma) and incubated for 30 minutes at room temperature. The beads were
subsequently washed with HBS containing 10 mM taurocholate. The proteins were
subsequently eluted from the resin with maltose, and the amount of fatty acid binding
to MBP-SP1, -2, -3, and -5 was assessed by determining the radioactivity associated
with the eluate by β-scintillation counting.

Unlike GST fusion proteins, MBP fusion proteins are not self-dimerizing.
Further, long-chain fatty acids (such as oleate and palmitate), but not short-chain fatty
acids (such as butyrate), were specifically bound by SP1 (Figure 117). This selective
binding is consistent with previous reports of the substrate specificity of FATP4
(Stahl, et al., Mol. Cell, 4, 299-308, 1999). The identification of a fatty acid binding
domain in FATP4 will be useful in the development of small molecules that inhibit the
binding and transport of fatty acids by FATP4 and may provide useful information on
the mechanism of fatty acid transport.



Results of Fatty Acid Binding

FATTY ACID      Composition    binding to MBP-SP1    binding to MBP-SP5

Oleate          C18H34O2       3968                  2800

Palmitate       C16H32O2       4588                  844

Arachidonate    C20H40O2       1942                  1147

Butyrate        C4H8O2         142                   633
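
The chain-length selectivity of the N-terminal fragment can be summarized as the ratio of MBP-SP1 binding to MBP-SP5 binding for each fatty acid. This is a rough sketch: the units and any replicate information are not given in the text, and treating MBP-SP5 as a background reference is an assumption made here for illustration.

```python
# Tabulated binding values: (MBP-SP1, MBP-SP5). Units are not stated.
mbp_binding = {
    "Oleate": (3968, 2800),
    "Palmitate": (4588, 844),
    "Arachidonate": (1942, 1147),
    "Butyrate": (142, 633),
}

# Ratio > 1 means more label was recovered with MBP-SP1 than with MBP-SP5.
sp1_to_sp5 = {
    fatty_acid: sp1 / sp5 for fatty_acid, (sp1, sp5) in mbp_binding.items()
}

for fatty_acid, ratio in sp1_to_sp5.items():
    print(f"{fatty_acid}: SP1/SP5 = {ratio:.2f}")
```

With these numbers the three long-chain fatty acids give ratios above 1, and butyrate gives a ratio well below 1, consistent with the statement that SP1 binds long-chain but not short-chain fatty acids.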



These experiments demonstrate that the FATPs of the present invention
contain domains that bind various long chain fatty acids. Thus, polypeptides
containing these domains can be prepared and utilized to assess the modulation of
binding and transport function by a variety of agents. The polypeptides with the
highest binding capacities were shown to be those containing a lipocalin domain (such
as those shown in Figure 118) with additional upstream residues, such as those
associated with this domain in the N-terminal portion of hsFATP4. Polypeptides
containing domains in addition to the lipocalin domain (for example, those containing
an AMP binding domain) were also shown to bind fatty acids at significant levels.

Figure 118 contains an alignment depicting the consensus sequences for the six
human FATP polypeptides, hsFATP1, hsFATP2, hsFATP3, hsFATP4, hsFATP5 and hsFATP6.
A lipocalin domain and an AMP binding domain for each polypeptide
are both identified and compared. A search using the lipocalin signature sequence
[DENG]-X-[DENQGSTARK]-X(0,2)-[DENQARK]-[LIVFY]-{CP}-G-{C}-W-
[FYWLRH]-X-[LIVMTA] conducted on a public database (www.ebi.ac.uk/interpro/)
indicated that the lipocalin domains of hsFATP1 and hsFATP4 are identical to the
lipocalin signature sequence. In addition, a search directed to identifying sequences
having at least 80% identity to the lipocalin signature sequence identified three
additional human FATPs, hsFATP3, hsFATP5 and hsFATP6.

The following is the result of comparing individual hsFATP protein sequences
with the lipocalin domain identified for hsFATP1 and hsFATP4. The comparison was
made using the BLAST Network Service at the National Center for Biotechnology
Information. (Capitalized AA agree with the lipocalin signature sequence.)

FATP6: 114 to 125 NEpDFVhVWFGL. 76% similarity (SEQ ID NO: 138)

AATGAGCCGGACTTCGTTCACGTGTGGTTCGGCCTC

FATP5: 182 to 194 sQAVpaLcMWLGL. 53% similarity (SEQ ID NO: 139)

TCCCAGGCCGTTCCAGCCCTGTGTATGTGGCTGGGGCTG

FATP4: 134 to 146 ENRNEFVGLWLGM. Identity (SEQ ID NO: 129)

GAGAACCGCAATGAGTTCGTGGGCCTATGGCTGGGCATG

FATP3: 221 to 234 lPAGPEFLwLWFGL. 69% similarity (SEQ ID NO: 140)

CTCCCCGCTGGCCCAGAGTTTCTGTGGCTCTGGTTCGGGCTG

FATP2: 112 to 124 GNEPAYVwLWLGL. 80% similarity (SEQ ID NO: 127)

GGTAACGAGCCGGCCTACGTGTGGCTGTGGCTGGGGCTG

FATP1: 136 to 148 EGRPEFVGLWLGL. Identity (SEQ ID NO: 126)

GAGGGCCGGCCGGAGTTCGTGGGGCTGTGGCTGGGCCTG
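
The signature comparison can be reproduced in outline by converting a PROSITE-style pattern to a regular expression and testing each peptide listed above. This is a minimal sketch, not the search performed in the application: the pattern string is assumed to be the lipocalin signature in PROSITE notation (x = any residue, {..} = any residue except those listed), the converter handles only the syntax elements used in that pattern, and lowercase letters in the listed peptides are uppercased before matching.

```python
import re

# Assumed PROSITE-style form of the lipocalin signature quoted in the text.
LIPOCALIN = ("[DENG]-x-[DENQGSTARK]-x(0,2)-[DENQARK]-[LIVFY]-"
             "{CP}-G-{C}-W-[FYWLRH]-x-[LIVMTA]")


def prosite_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simple PROSITE pattern into a compiled regex."""
    parts = []
    for element in pattern.split("-"):
        core, _, repeat = element.partition("(")
        if core in ("x", "X"):
            regex = "."                        # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"    # excluded residues
        else:
            regex = core                       # literal or [allowed set]
        if repeat:
            regex += "{" + repeat.rstrip(")") + "}"
        parts.append(regex)
    return re.compile("".join(parts))


# Peptides as listed in the text (lowercase marks disagreement there).
peptides = {
    "FATP1": "EGRPEFVGLWLGL",
    "FATP2": "GNEPAYVwLWLGL",
    "FATP3": "lPAGPEFLwLWFGL",
    "FATP4": "ENRNEFVGLWLGM",
    "FATP5": "sQAVpaLcMWLGL",
    "FATP6": "NEpDFVhVWFGL",
}

signature = prosite_to_regex(LIPOCALIN)
exact_match = {
    name: signature.fullmatch(seq.upper()) is not None
    for name, seq in peptides.items()
}
```

Under these assumptions only the FATP1 and FATP4 peptides satisfy the pattern exactly, which matches the "Identity" annotation given for those two entries; the percent similarity values quoted for the others come from the BLAST comparison and are not recomputed here.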
All references cited herein are incorporated by reference in their entirety.

While this invention has been particularly shown and described with references
to preferred embodiments thereof, it will be understood by those skilled in the art that
various changes in form and details may be made therein without departing from the
spirit and scope of the invention as defined by the appended claims.

CLAIMS



What is claimed is: 



1. An isolated nucleic acid comprising the nucleotide sequence of SEQ ID
NO.: 116 or its complement.

2. An isolated nucleic acid comprising the coding sequence of SEQ ID
NO.: 116.

3. An isolated nucleic acid which encodes a polypeptide comprising the
amino acid sequence of SEQ ID NO.: 117 or its complement.

4. An isolated nucleic acid which hybridizes under stringency conditions
of 6X SSC at 65° C, followed by at least two washes in 0.2X SSC/0.5%
SDS at 65° C, to the nucleic acid comprising the nucleotide sequence
of SEQ ID NO.: 116.

5. An isolated nucleic acid consisting of a nucleotide sequence having at
least 95% identity to a nucleotide sequence of Claim 1.

6. An isolated nucleic acid consisting of a nucleotide sequence having at
least 90% identity to a nucleotide sequence of Claim 1.

7. An isolated nucleic acid encoding a fusion polypeptide, wherein the
isolated nucleic acid comprises a nucleotide sequence of SEQ ID
NO.: 116.

8. A vector comprising a nucleic acid of Claim 1.

9. A vector comprising a nucleic acid of Claim 2.

10. A vector comprising a nucleic acid of Claim 3.

11. A vector comprising a nucleic acid of Claim 4.

12. A vector comprising a nucleic acid of Claim 5.

13. A vector comprising a nucleic acid of Claim 6.

14. A vector comprising a nucleic acid of Claim 7.

15. An isolated host cell transfected with the vector of Claim 8.

16. An isolated host cell transfected with the vector of Claim 9.

17. An isolated host cell transfected with the vector of Claim 10.

18. An isolated host cell transfected with the vector of Claim 11.

19. An isolated host cell transfected with the vector of Claim 12.






20. An isolated host cell transfected with the vector of Claim 13.

21. An isolated host cell transfected with the vector of Claim 14.

22. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 15 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

23. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 16 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

24. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 17 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

25. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 18 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

26. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 19 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

27. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 20 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

28. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 21 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

29. An isolated nucleic acid comprising at least 30 contiguous nucleotides
of the nucleotide sequence of SEQ ID NO.: 116.

30. An isolated nucleic acid comprising at least 200 contiguous nucleotides
of the nucleotide sequence of SEQ ID NO.: 116.

31. An isolated polypeptide comprising the amino acid sequence of SEQ
ID NO.: 117.

32. An isolated naturally occurring allelic variant of a polypeptide
consisting of the amino acid sequence of Claim 31.

33. An isolated polypeptide consisting of an amino acid sequence having at
least 95% identity to the amino acid sequence of Claim 31.

34. An isolated polypeptide consisting of an amino acid sequence having at
least 90% identity to the amino acid sequence of Claim 31.

35. An isolated polypeptide encoded by a nucleic acid that hybridizes to a
nucleic acid consisting of the nucleotide sequence of SEQ ID NO.: 117
under stringency conditions of 6X SSC at 65° C, followed by at least
two washes in 0.2X SSC/0.5% SDS at 65° C.

36. A fusion protein comprising a polypeptide consisting of the amino acid
sequence of SEQ ID NO.: 117.

37. The fusion protein of Claim 36, wherein the fusion protein transports
fatty acids across a cell membrane or an artificial cell membrane
system.

38. An isolated polypeptide comprising at least 15 contiguous amino acid
residues of SEQ ID NO.: 117.

39. An isolated polypeptide comprising at least 50 contiguous amino acid
residues of SEQ ID NO.: 117.

40. An isolated polypeptide comprising at least 360 contiguous amino acid
residues of SEQ ID NO.: 117.

41. An isolated polypeptide comprising an amino acid sequence having at
least 15 contiguous amino acid residues of SEQ ID NO.: 117, wherein
the isolated polypeptide transports fatty acids across a cell membrane or
an artificial cell membrane.

42. An isolated polypeptide encoded by a nucleic acid that hybridizes to a
nucleic acid consisting of the nucleotide sequence of SEQ ID NO.: 116
under stringency conditions of 6X SSC at 65° C, followed by at least
two washes in 0.2X SSC/0.5% SDS at 65° C.

43. A method for identifying an agent which binds to a protein comprising
an amino acid sequence of SEQ ID NO.: 117 comprising the steps of
contacting the agent with the isolated protein under conditions
appropriate for binding of the agent to the isolated protein, and
detecting a resulting agent-protein complex.

44. An agent identified by the method of Claim 43.

45. A method for identifying an agent which is an inhibitor of fatty acid
uptake by a protein encoded by a polynucleotide comprising a
nucleotide sequence which encodes a protein consisting of the amino
acid sequence of SEQ ID NO.: 117, comprising the steps of:

a) maintaining test cells expressing said polynucleotide in the
presence of a fatty acid and an agent to be tested as an inhibitor
of fatty acid uptake;

b) measuring uptake of the fatty acid in the test cells; and

c) comparing uptake of the fatty acid in the test cells with uptake
of the fatty acid in suitable control cells;

wherein lower uptake of the fatty acid in the test cells compared to
uptake of the fatty acid in the control cells is indicative that the agent is
an inhibitor of fatty acid uptake by said protein.

46. An inhibitor of fatty acid uptake identified by the method of Claim 45.

47. The method of Claim 45 further comprising the steps of:

a) administering the agent to one or more test animals;

b) measuring exogenously supplied fatty acids in one or more
samples of tissue or bodily fluid from said test animals;

c) measuring exogenously supplied fatty acids in one or more
comparable samples of tissue or bodily fluid from suitable
control animals;

d) comparing the fatty acids of b) with the fatty acids of c);

whereby, lower fatty acids in step b) than in step c) is indicative that the
agent is an inhibitor of said protein.

48. An inhibitor of fatty acid uptake identified by the method of Claim 47.

49. The method of Claim 45, wherein the nucleotide sequence which
encodes a protein consists of a nucleotide sequence with 95% identity
to a nucleotide sequence which encodes the polypeptide with SEQ ID
NO.: 117.

50. A method for identifying an agent which is an inhibitor of a protein
encoded by a polynucleotide comprising a nucleotide sequence which
encodes a protein comprising the amino acid sequence in SEQ ID
NO.: 117 comprising the steps of:

(a) introducing into host cells one or more vectors comprising a
polynucleotide expressing said protein;

(b) culturing a first aliquot of the host cells with fatty acid substrate
of said protein and with an agent being tested as an inhibitor of
said protein;

(c) culturing a second aliquot of the host cells with fatty acid
substrate of said protein;

(d) measuring, in the first and second aliquots, uptake of the fatty
acid substrate of the host cells;

wherein less uptake of the fatty acid substrate in the first aliquot
compared to the second aliquot is indicative that the agent is an
inhibitor of said protein.

51. An inhibitor of fatty acid uptake identified by the method of Claim 50.

52. The method of Claim 50 further comprising the steps of:

a) administering the agent to one or more test animals;

b) measuring exogenously supplied fatty acids in one or more
samples of tissue or bodily fluid from suitable control animals;

c) measuring exogenously supplied fatty acids in one or more
comparable samples of tissue or bodily fluid from the test
animals; and

d) comparing the fatty acids of the control animals with the fatty
acids of the test animals whereby, lower fatty acids in the
control animals than in the test animals is indicative that the
agent is an inhibitor of said protein.

53. A method for identifying an agent which binds to a protein comprising
an amino acid sequence of SEQ ID NO.: 117 comprising the steps of
contacting the agent with the isolated protein under conditions
appropriate for binding of the agent to the isolated protein, and
detecting a resulting agent-protein complex.
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A method for identifying an agent which inhibits interaction between 
an isolated protein comprising an amino acid sequence of SEQ ID 
NO.:l 17, and further comprising a ligand of said protein, comprising: 

(a) combining: 

(1) said isolated protein; 

(2) the ligand of said protein; and 

(3) a candidate agent to be assessed for its ability to inhibit 
interaction between said protein of (1) and the ligand of 
(2), under conditions appropriate for interaction 
between the said protein of (1) and the ligand of (2); 

(b) determining the extent to which said protein of (1) and the 
ligand of (2) interact; and 

(c) comparing the extent determined in (b) with the extent to which 
interaction of said protein of (1) and the ligand of (2) occurs in 
the absence of the candidate agent to be assessed and under the 
same conditions appropriate for interaction of said protein of 
(1) with the ligand of (2); 

wherein if the extent to which interaction of said protein of (1) and the 
ligand of (2) occurs is less in the presence of the candidate agent than 
in the absence of the candidate agent, the candidate agent is an agent 
which inhibits interaction between said protein and the ligand of said 
protein. 
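The comparison recited in steps (b) and (c) can be pictured with a small numerical sketch. The Python below is illustrative only: the function names, the arbitrary signal units, and the example readings are assumptions made for this sketch, and the claim itself does not prescribe any particular readout or calculation.

```python
def percent_inhibition(with_agent: float, without_agent: float) -> float:
    """Return the percentage by which the agent reduces the interaction signal."""
    if without_agent <= 0:
        raise ValueError("reference interaction signal must be positive")
    return 100.0 * (without_agent - with_agent) / without_agent


def inhibits_interaction(with_agent: float, without_agent: float) -> bool:
    """Apply the stated criterion: less interaction in the presence of the agent."""
    return with_agent < without_agent


# Hypothetical protein-ligand interaction readings (arbitrary units),
# both taken under the same conditions, with and without the candidate agent.
signal_without_agent = 1000.0
signal_with_agent = 400.0

print(inhibits_interaction(signal_with_agent, signal_without_agent))
print(percent_inhibition(signal_with_agent, signal_without_agent))
```

In practice a comparison of this kind would normally rest on replicate measurements and a suitable control; the single pair of values here only shows the direction of the test.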

55. A method for detecting, in a sample of cells, a nucleic acid molecule
consisting of a nucleotide sequence with at least 90% sequence identity
to SEQ ID NO.: 116, comprising:

a) purifying nucleic acid from the cells;

b) hybridizing 1) purified nucleic acid from the cells to 2) purified nucleic
acid comprising SEQ ID NO.: 116, under conditions that allow
hybridization between 1) and 2) if the sequences of 1) and 2) have at
least 90% sequence identity; and

c) detecting resulting hybrid nucleic acids in the hybridization; wherein, if
hybrid nucleic acids are detected at a significant level compared to a
suitable control hybridization, then a nucleic acid molecule comprising
at least 90% sequence identity to SEQ ID NO.: 116 has been detected.



56. A method for identifying (1) nucleic acid molecules in fixed cells
which specifically interact with a (2) nucleic acid molecule comprising
the nucleotide sequence in SEQ ID NO.: 116, said method comprising
the steps of:

a) adding to the fixed cells the nucleic acid molecule comprising a
nucleotide sequence in SEQ ID NO.: 116;

b) incubating the fixed cells under conditions allowing
hybridization of (1) with (2);

c) removing the nucleic acid molecule of step a) that has not
hybridized; and

d) detecting hybrid molecules comprising (1) and (2).



57. A method for detecting FATP3 in a sample of cells, comprising the
steps of adding an agent that specifically binds to FATP3 to the
sample, and detecting the agent specifically bound to the FATP3.

58. The method of Claim 57 wherein the agent is an antibody which
specifically binds to FATP3.



WO 01/21795 PCT/US00/25891 

59. A method for detecting FATP3 in a sample of cell lysate, comprising
the steps of adding an agent that specifically binds to FATP3 to the
sample, and detecting agent specifically bound to the FATP3.

60. The method of Claim 59 wherein the agent is an antibody which
specifically binds to FATP3.

61. An isolated antibody which binds to a polypeptide having an amino
acid sequence consisting of at least 95% amino acid sequence identity
with the amino acid sequence of SEQ ID NO.: 117.

62. An isolated antibody which binds to a fatty acid transport protein
having the amino acid sequence of SEQ ID NO.: 117.

63. A method for detecting, in a sample of cells, a nucleic acid molecule
comprising at least 90% sequence identity to SEQ ID NO.: 116
comprising:

a) purifying nucleic acid from the cells;

b) hybridizing 1) purified nucleic acid from the cells to 2) purified
nucleic acid comprising SEQ ID NO.: 116 under conditions that
allow hybridization between 1) and 2) if the sequences of 1)
and 2) have at least 90% sequence identity; and

c) detecting resulting hybrid nucleic acids in the hybridization;
wherein, if hybrid nucleic acids are detected at a significant
level compared to a suitable control hybridization, then a
nucleic acid molecule having at least 90% sequence identity to
SEQ ID NO.: 116 has been detected.

64. A method for detecting, in a sample of purified nucleic acid, a nucleic
acid molecule comprising at least 90% sequence identity to SEQ ID
NO.: 116 comprising:

a) hybridizing 1) the sample of purified nucleic acid to 2) purified
nucleic acid comprising SEQ ID NO.: 116 under conditions that
allow hybridization between 1) and 2) if the sequences of 1)
and 2) have at least 90% sequence identity; and

b) detecting resulting hybrid nucleic acids in the hybridization;
wherein, if hybrid nucleic acids are detected at a significant
level compared to a suitable control hybridization, then a
nucleic acid molecule having at least 90% sequence identity to
SEQ ID NO.: 116 has been detected.
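The claims above test for identity physically, by hybridization under suitably stringent conditions. Purely to illustrate what the 90% figure means arithmetically, the sketch below computes percent identity between two sequences that have already been aligned. The function names, the convention of skipping gap columns, and the two short example sequences are assumptions for this sketch; none of them is taken from SEQ ID NO.: 116.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns at which two sequences match.

    Both inputs must already be aligned to equal length, with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (one common convention, assumed here)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True if the two aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 20-base sequences that differ at a single position.
reference = "ATGCTTGGAGCCTCTCTGGT"
candidate = "ATGCTTGGAGCCTCACTGGT"

print(percent_identity(reference, candidate))
print(meets_identity_threshold(reference, candidate))
```

Different alignment tools count gaps and terminal overhangs differently, so a reported identity value depends on the convention chosen.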

65. A method for detecting FATP3 in a sample of cells, comprising the
steps of adding an agent that specifically binds to FATP3 to the
sample, and detecting agent specifically bound to the FATP3.



66. The method of Claim 65 wherein the agent is an antibody which binds 
to FATP3. 



67. A vector comprising a FATP regulatory sequence and at least one
targeting sequence directed to the regulatory region of a nucleic acid
with a nucleotide sequence selected from the group consisting of:

a) SEQ ID NO.: 46;

b) SEQ ID NO.: 48;

c) SEQ ID NO.: 116;

d) SEQ ID NO.: 52;

e) SEQ ID NO.: 54; and

f) SEQ ID NO.: 56.



68. An isolated host cell transfected with a vector of Claim 67. 



69. A method of producing a polypeptide comprising culturing the host
cell of Claim 68 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

70. An isolated nucleic acid comprising a nucleotide sequence encoding a
functional portion or fragment of a FATP polypeptide comprising a
lipocalin domain.

71. The isolated nucleic acid of Claim 70 further comprising a nucleotide
sequence encoding upstream amino acid residues.



72. An isolated nucleic acid comprising a nucleotide sequence encoding a
portion or fragment of a FATP protein containing a lipocalin domain,
wherein the nucleotide sequence is selected from the group consisting
of portions or fragments of:

a) SEQ ID NO.: 46;

b) SEQ ID NO.: 48;

c) SEQ ID NO.: 116;

d) SEQ ID NO.: 52;

e) SEQ ID NO.: 54; and

f) SEQ ID NO.: 56.

73. An isolated nucleic acid of Claim 72 further comprising at least about
90 nucleotides of the sequence upstream of the lipocalin domain.
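A stretch of 90 nucleotides of coding sequence is 30 codons, that is, 30 amino acid residues in the encoded polypeptide. A minimal sketch of taking a domain-encoding fragment together with its upstream nucleotides follows. The function name, the toy coding sequence, and the domain coordinates are hypothetical and are not taken from any SEQ ID NO. in this document.

```python
def fragment_with_upstream(cds: str, domain_start: int, domain_end: int,
                           upstream_nt: int = 90) -> str:
    """Return the domain-coding region plus up to `upstream_nt` bases before it.

    Coordinates are 0-based and end-exclusive, and refer to positions in `cds`.
    If fewer than `upstream_nt` bases precede the domain, all of them are kept.
    """
    if not (0 <= domain_start < domain_end <= len(cds)):
        raise ValueError("domain coordinates out of range")
    if upstream_nt % 3 != 0:
        raise ValueError("upstream length must be a whole number of codons")
    start = max(0, domain_start - upstream_nt)
    return cds[start:domain_end]


# Toy 180-nucleotide coding sequence; the "domain" is assumed to occupy
# the last 60 nucleotides (20 codons).
toy_cds = "ATG" + "GCT" * 59
fragment = fragment_with_upstream(toy_cds, domain_start=120, domain_end=180)

print(len(fragment))       # nucleotides: 90 upstream + 60 domain
print(len(fragment) // 3)  # codons, i.e. encoded residues
```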

74. A vector comprising a nucleic acid of Claim 73. 

75. An isolated host cell comprising the vector of Claim 74. 

76. A method of producing a polypeptide comprising the step of culturing 
the host cell of Claim 75 under conditions in which the nucleic acid is 
expressed, thereby producing the polypeptide. 

77. A functional portion or fragment of a FATP polypeptide comprising a 
lipocalin domain. 



78. The FATP polypeptide of Claim 77 further comprising upstream amino
acid residues.



79. An isolated polypeptide comprising an amino acid sequence containing
a FATP lipocalin domain, wherein the amino acid sequence is selected
from the group consisting of portions or fragments of:

a) SEQ ID NO.: 47;

b) SEQ ID NO.: 49;

c) SEQ ID NO.: 117;

d) SEQ ID NO.: 53;

e) SEQ ID NO.: 55; and

f) SEQ ID NO.: 57.

80. A functional portion or fragment of a FATP polypeptide comprising an
amino acid sequence selected from the group consisting of:

a) SEQ ID NO.: 126;

b) SEQ ID NO.: 127;

c) SEQ ID NO.: 128;

d) SEQ ID NO.: 129;

e) SEQ ID NO.: 130; and

f) SEQ ID NO.: 131.


81. A fusion protein comprising a polypeptide consisting of a FATP 
polypeptide containing a lipocalin domain. 



82. The fusion protein of Claim 81 further comprising upstream sequences. 

83. The fusion protein of Claim 82, wherein the upstream sequences 
comprise at least about 30 amino acid residues of an upstream 
sequence. 



84. A fusion protein comprising a polypeptide consisting of a FATP
polypeptide containing a lipocalin domain, wherein the polypeptide
consists of an amino acid sequence selected from the group consisting
of portions or fragments of:

a) SEQ ID NO.: 47;

b) SEQ ID NO.: 49;

c) SEQ ID NO.: 117;

d) SEQ ID NO.: 53;

e) SEQ ID NO.: 55; and

f) SEQ ID NO.: 57.



85. The fusion protein of Claim 84 further comprising upstream sequences. 



86. A method for identifying an agent which binds to a polypeptide,
wherein the polypeptide comprises a FATP lipocalin domain,
comprising the steps of contacting the agent with the polypeptide under
conditions appropriate for binding of the agent to the polypeptide, and
detecting a resulting agent-polypeptide complex.



87. The agent identified by the method of Claim 86. 



88. A method for identifying an agent which binds to a polypeptide,
wherein the polypeptide comprises a FATP lipocalin domain and about
30 amino acid residues of an upstream sequence, comprising the steps
of contacting the agent with the polypeptide under conditions
appropriate for binding of the agent to the polypeptide, and detecting a
resulting agent-polypeptide complex.



89. The agent identified by the method of Claim 88. 



90. A method for identifying an agent which binds to a polypeptide,
wherein the polypeptide comprises a FATP lipocalin domain and
consists of an amino acid sequence selected from the group consisting
of portions or fragments of:

a) SEQ ID NO.: 47;

b) SEQ ID NO.: 49;

c) SEQ ID NO.: 117;

d) SEQ ID NO.: 53;

e) SEQ ID NO.: 55; and

f) SEQ ID NO.: 57,

comprising the steps of contacting the agent with the polypeptide under
conditions appropriate for binding of the agent to the polypeptide, and
detecting a resulting agent-polypeptide complex.

91. An agent identified by the method of Claim 90.

92. A method for identifying an agent which binds to a polypeptide,
wherein the polypeptide comprises an amino acid sequence selected
from the group consisting of:

a) SEQ ID NO.: 126;

b) SEQ ID NO.: 127;

c) SEQ ID NO.: 128;

d) SEQ ID NO.: 129;

e) SEQ ID NO.: 130; and

f) SEQ ID NO.: 131,

comprising the steps of contacting the agent with the polypeptide under
conditions appropriate for binding of the agent to the polypeptide, and
detecting a resulting agent-polypeptide complex.

93. An agent identified by the method of Claim 92.

94. A method for identifying an agent which binds to a polypeptide
comprising a FATP lipocalin domain, wherein the polypeptide is
encoded by a nucleotide sequence consisting of portions or fragments
of:

a) SEQ ID NO.: 46;

b) SEQ ID NO.: 48;

c) SEQ ID NO.: 116;

d) SEQ ID NO.: 52;

e) SEQ ID NO.: 54; and

f) SEQ ID NO.: 56,

comprising the steps of contacting the agent with the polypeptide under
conditions appropriate for binding of the agent to the polypeptide, and
detecting a resulting agent-polypeptide complex.

95. An agent identified by the method of Claim 94. 

96. A method for identifying an agent which binds to a polypeptide
comprising a FATP lipocalin domain and upstream sequences, wherein
the polypeptide is encoded by a nucleotide sequence consisting of
portions or fragments of:

a. SEQ ID NO.: 46;

b. SEQ ID NO.: 48;

c. SEQ ID NO.: 116;

d. SEQ ID NO.: 52;

e. SEQ ID NO.: 54; and

f. SEQ ID NO.: 56,

comprising the steps of contacting the agent with the polypeptide under
conditions appropriate for binding of the agent to the polypeptide, and
detecting a resulting agent-polypeptide complex.

97. An agent identified by the method of Claim 96. 

98. An isolated nucleic acid sequence comprising the nucleic acid
sequence of SEQ ID NO: 113.

99. The portion of the isolated nucleic acid sequence of Claim 98 which
encodes a FATP regulatory protein.

100. The portion of the isolated nucleic acid sequence of Claim 98 which
encodes a FATP5 promoter.

101. A method of identifying an agent which alters the level of expression
of the nucleic acid encoding an FATP protein comprising:

(a) determining a base level of expression of the nucleic acid encoding the
FATP protein;

(b) contacting an agent with an isolated nucleic acid containing the
coding region of the FATP protein under functional control of
its promoter under conditions suitable for binding of the agent
to the promoter;

(c) maintaining agent-promoter binding during expression of the
FATP protein; and

(d) comparing the level of expression of the agent bound promoter
to that of the baseline level of expression,

whereby, if the level of expression of the agent bound promoter is
significantly different from that of the baseline level of expression,
then an agent which alters the level of expression of the nucleic acid
encoding the FATP protein has been identified.
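The comparison in step (d) can be sketched numerically. The claim requires only that expression be "significantly different" from baseline and does not name a test, so the two-fold cutoff used below is an assumption chosen for illustration, as are the function name and the example measurements.

```python
from statistics import mean


def classify_expression_change(baseline, treated, min_fold=2.0):
    """Classify an agent's effect from replicate expression measurements.

    `min_fold` is a hypothetical cutoff standing in for whatever
    significance criterion is actually applied.
    """
    base = mean(baseline)
    trt = mean(treated)
    if base <= 0 or trt < 0:
        raise ValueError("expression levels must be non-negative, baseline positive")
    fold = trt / base
    if fold >= min_fold:
        return "promoted"
    if fold <= 1.0 / min_fold:
        return "inhibited"
    return "no change under this cutoff"


# Hypothetical reporter readings (arbitrary units), three replicates each.
baseline_levels = [100, 110, 90]

print(classify_expression_change(baseline_levels, [30, 35, 25]))
print(classify_expression_change(baseline_levels, [240, 260, 250]))
print(classify_expression_change(baseline_levels, [105, 95, 100]))
```

A statistical test on the replicates (for example a t-test) could replace the fold-change cutoff without changing the overall structure of the comparison.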

102. The method of Claim 101, wherein the FATP protein is FATP2.

103. The method of Claim 102, wherein the FATP2 is encoded by a nucleic
acid comprising the nucleotide sequence of SEQ ID NO: 48.

104. The method of Claim 102, wherein the FATP2 comprises the amino
acid sequence of SEQ ID NO: 49.

105. The method of Claim 102, wherein expression is inhibited.

106. The method of Claim 102, wherein expression is promoted.

107. The method of Claim 101, wherein the FATP protein is FATP5.

108. The method of Claim 107, wherein the FATP5 is encoded by a nucleic
acid comprising the nucleotide sequence of SEQ ID NO: 54.

109. The method of Claim 107, wherein the FATP5 comprises the amino
acid sequence of SEQ ID NO: 55.

110. The method of Claim 107, wherein expression is inhibited.

111. The method of Claim 107, wherein expression is promoted.

112. A method for directing an agent to liver cells in a mammal, comprising 
administering to the mammal a complex which comprises the agent 
and a moiety which binds to FATP2. 



113. The method of Claim 112, wherein the agent alters fatty acid uptake in
liver cells.

114. The method of Claim 112, wherein the agent alters the level of fatty
acids in bile.

115. A method for directing an agent to the gall bladder in a mammal,
comprising administering to the mammal a complex which comprises
the agent and a moiety which binds to FATP2.

116. The method of Claim 115, wherein the agent alters the level of fatty
acids in bile.

117. A method for directing an agent to the liver in a mammal, comprising
administering to the mammal a complex which comprises the
substance and a moiety which binds to FATP5.

118. The method of Claim 117, wherein the agent alters the uptake of fatty
acids in liver cells.

119. The use of an isolated nucleic acid comprising the nucleotide sequence
of SEQ ID NO.: 116 or its complement in the manufacture of a
medicament.



120. The use of an isolated polypeptide comprising the amino acid sequence
of SEQ ID NO.: 117 in the manufacture of a medicament.

121. The use of an agent which is an inhibitor of fatty acid uptake of a
protein with the amino acid sequence of SEQ ID NO.: 117 in the
manufacture of a medicament.



122. The use of an isolated nucleic acid comprising a nucleotide sequence
encoding a portion or fragment of a FATP protein containing a
lipocalin domain in the manufacture of a medicament, wherein the
nucleotide sequence is selected from the group consisting of portions
or fragments of:

a) SEQ ID NO.: 46;

b) SEQ ID NO.: 48;

c) SEQ ID NO.: 116;

d) SEQ ID NO.: 52;

e) SEQ ID NO.: 54; and

f) SEQ ID NO.: 56.



123. The use of an isolated polypeptide comprising an amino acid sequence
containing a FATP lipocalin domain in the manufacture of a
medicament, wherein the amino acid sequence is selected from the
group consisting of portions or fragments of:

a) SEQ ID NO.: 47;

b) SEQ ID NO.: 49;

c) SEQ ID NO.: 117;

d) SEQ ID NO.: 53;

e) SEQ ID NO.: 55; and

f) SEQ ID NO.: 57.



124. The use of an isolated polypeptide in the manufacture of a medicament,
the polypeptide comprising an amino acid sequence selected from the
group consisting of:

1. SEQ ID NO.: 126;

2. SEQ ID NO.: 127;

3. SEQ ID NO.: 128;

4. SEQ ID NO.: 129;

5. SEQ ID NO.: 130;

6. SEQ ID NO.: 131.



125. The use of an isolated polypeptide in the manufacture of a medicament
for treating obesity, the polypeptide comprising an amino acid
sequence selected from the group consisting of:

7. SEQ ID NO.: 126;

8. SEQ ID NO.: 127;

9. SEQ ID NO.: 128;

10. SEQ ID NO.: 129;

11. SEQ ID NO.: 130; and

12. SEQ ID NO.: 131.

[Drawing sheets: nucleotide and amino acid sequence figures, printed 40 residues per row with position rulers. The sequence text is not legible in this scanned copy.]

AG77GGG7GCCAC77G i G7G ; 7AAAGAAGAAA7777CAGC 1520 

AAGCCAG77 7 7GGAG7GAC 7GC AAGAAG7A 7GA 7G7GAC 7 1 -SO 

G7G777CAG 7A7A 77GGAGA AC777G7CGC 7ACC 777GC A 1500 

16 10 1520 I £30 15^0 

AA.CAA7C 7AAGAGAGAAGGAGAAAAGGA7CA7AAGG 7GCG ! 5 4C 

777GGCAA77GGAAA7GGC A7ACGGAG7GA 7G7A 7GGAGA 1 550 

GAA 77 777A.GACAG A77 7GG AAA 7A.7AAA.GG7G7G 7G A AC I 720 

77 7A75CAGC7ACCGAA7CAAGCA7A7C777CA7GAAC7A 17=0 

CAC 7GGGAGAA 77GGAGC A A 7 7G GGA.GA AC AAA ~ 7 7G 77 7 \ 500 

ic ;C 1520 1520 15^ 

7ACAAA.C77C 7 7 7CC AC 7 7 7 7G AC 7 7 AA.7AAAG 7A 7G AC 7 ic40 

77CAC AA AG A. 7GA A CCC A 7G AG A AA7GAGCA.GGG 7 7GG7G iccO 

7A77CA7G7GAAAAAAGGAGAACC7GGAC7 7C7CA77 7C7 i 920 

C G A G 7 G A A 7 G C A. AAA A A. "CCC 7 7 C 7 7 7 G G C 7 A 7 G C 7 G G G C 1550 

C 77A 7AAGCA.C AC A. A A AGAC A A.A "7GC7 77G 7GA 7G 77 77 200 C 

20 !C 2020 20 CO 2040 - 



7AAGAA Z GG A GA r G T 7 7 A C I 7 7 - A 7 a C 7GGA GAC 7 ~ A A 7 A 20- -0 
G 7CCAGGA r C A 2GA.C A A *"7 7C Z 7 7 7 A 777 7 7 Z G G A C I G T - 2050 
C 7GGAGACAC 77 70 AG A "GG AAA OGA GAAAA7G ~CGC - - C 2 ! 20 ! 
OAC~GAGG7 7G0 7GA7G77A7 7GGAA7G'7GGA"7 7:i*A 2 150 
C AGGAAGCAAACG7C 7A 7GG 7G 7GGC 7a 7A 7CA GG 7 7 A 7G 2200 
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A AGGA AG AGO AGG A A 70 GO 7 70 7A77A~7 77AAAACCAAA 2240 
7AC A r C 7 77AGA" 7 7GGA A A A AG 777A 7GAACAAG7 7G7A 22=0 



ACA 77 ; C 7AC0 AGO 7 7 A "G 
7 7CAGGAAAAAA"GGAAGC 
GAAGCA "AG 7 7GG ~GGA A 



7 i G7CCACGA"7 r 77AA2AA 2220 
A ACAGGA ACA 77ZAA AC 7- 77 2250 
G A ~GGA 7 77 A A 7C I AC 7=2A AA 2*00 



24 iO 2^20 242C 2440 
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A 777C7GAAC0 AC 777AC 7 70A 7GGA 7AAC 7 7G AAA A AG 7 2440 
C "7A r G7 7C 7 AC 7GA00AGGGAAC777A7GA 7CA AA 7 A A. 7 2450 
G77AGGGGAA A 7A A A AC 7 7 7AAGA77777A7A7C7AGA AC 2520 
777CA7A"G0 777077AGGAAGAG7GAGAGGGGGG7A7A7 2550 
GA7 7C 7 77A 7GA AA 7GGGG A A AGGG A GC 7AACA 77AA 7 7A 2500 
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7GCA7G7AC7A7A777CC77AA7A7GAGAGA7AA777777 2640 
AA77GCA7AAGAA7777AA.777C7777AA77GA7A7AAAC 2660 
ATTAGT7GATTAT7CTTTT i ATCTATTTGGAGATTCAGTG 2720 
CATAAC7AAG7A77TTCC7TAATAC7AAAGA7TTTAAATA 2760 
A~A£A7AG7GGC7AGCGG777GGACAA7CAC7AAAAA7G7 2S0O 
28 iO 2S20 2S30 2S4Q 
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ACT77C7AA7AAG7AAAA777C7AA7777GAA7AAAAGA7 2S40 
7AAA7777AC7GAAAAAAAAAAAAAAAAAAAAAA77GGCG 25S0 
GCCGC 2SSS 
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AAG77CCCAC7CCAGAC7 7C 7GCGAGAACCCG7GAGGAAG «W 
CAGCGAGAACCGGGGG777GC A AGCCA.GAGAAGGA7GCGG SO 
AC7C CGGG AGO AGG A AC A GC C 7C 7G7GGCC 7C A 77GGGGC 120 
7GC7 77GGC77C7GGGAC77CCG7GGACC7GGAGCGCGGC ISO 
GGCGGC G 7 7CGG7G7G "AC G 7GGG7AGCGG7GGC7GGCGA 200 

210 220 220 2'4G 

777C7GCG7A7CG7C7GC AAGA.CGGCGAGGCGAGACC 7C 7 2-0 
T7GGGC7C7C7G77C7GA7CGGCG7GCGGC7AGAGC7ACG 230 
ACGACACCGGCGAGC A GGA G AC AC GA 7CCCACGCA "C "7C 520 
CAGGCCG7GGCCCAGCGA.C AGCCGGAGCGCC7GGCGC7GG 3S0 
7AGA7GCGAG7AGGGG7A7C 7GC 7 GG AC C 7 7CGC AC AGG 7 ^OC 

<M0 420 430 iiiic 

AGACACC7AC7CCAA7GG7G7GGCCAA7C7G7rC:7:CAG ^tO 
C7GGGC777GCGCCAGGCGA 7G 7GG 7GGC 7G7G 77CC 7GG -cO 
AAGGCCGGCCCGAG77CG7GGGAC~G7GGC7GGGCC r GGC 520 
CAAGGCCGG7G~AG7GGC7GCGC7 7C7CAA7G7CAACC7G 5cO 
AGGCGGGAGCCCC 7 7GCC 77C 7 G C 7 7 G G G C A C A 7 C A G C 7G SCO 

ciO £20 £30 c-G 



C - A A u - i - A ; t A . w w G •-: A A . ^ G A G G G G C G G ■ c -r*^. 
G Aum • c.-c : ^A^ : .- ; c^^u---- 3CC 7GC 7 IAAG £30 
77G 7GC 7C7GGAGA 7G "GGGGCC 7GA CAGCG~CC 7GCG 7 1 720 
^CACZZAZZTTZ 7GGACCCC A 7GC 7 7GC 7GAGGCGCG7.AC 730 
CACACCCC7GGCACAGGCCCC AGGC AAGGGCA 7 IGA 7GA r cGC 

5 10 £20 33C 3'~C 
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CGGC7A7777ACA7C7A7AC7 7C7GGGACCACCGGAC77C cAG 
C 7AAGGCGGCCA77G7GG 7GC Ai AGCAGG 7 AC 7A.CCGCA7 E3C 
CGCAGCC77CGGCC ACC A 7 7CC r ACAGCA 7GCGGGCCAAC S20 
GA7G7GG7C7A"GAC7GCG 7 AGG "C 7C 7ACCAC 7CAGC AG ScO 
GGAACA7CA"GGG0G 7GGGAGAG 7G"!"A7Ga 7C7ACGGG r 7 IOCC 

iOiC iC20 iOGC [040 



AACGG 7GG 7AC7GCGG A AG A AG 7 7C7CGGCCAGCCGC 77: 1040 

7GGGACGAC7G"G7G a A A 7 A 7 A A. 7 7 G G A C G G 7 A G 7GC A G 7 1 030 

ACA7CGG7GAAA7A7GCCGC7AG07GC7AAGGGAGGGGG7 I 120 

TCGCGA 7G7AGAGGGGCGGCACCGGG7GGGCG7GGGGG7G t ISO 

GG7AACGGAC 7GCGGCCAGCCA 7C 7GGGAGGAG77CACGC 1200 

fa- 56 A 
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AGGG777CGG7G7GCGACAGA77GGCGAG77C7ACGGCGC 12^0 

CACCGAA7GCAAC7GCAGCA7TGCCAACATGGACGGCAAG 1230 

G7CGGC7GC7GCGGG77CAACAGCCG7A7CC7CACGCA7G 1320 

TG7ACCCCA7CCG7CTGGTCAAGG7CAACGAGGACACGA7 1 360 

GGAGC:a.C7GA.GGGAG7CCCAAGGCC7C7GCA7CCCG7GC I40G 

14 !0 1420 1420 1440 

C.AGCCCGGGGAACC~GGGC~rC~CG?GGGCCAGArCAACC 1440 
AGCAAGACCC7C7GGGGCGC 77CGA.7GGC7A7G77AG7GA 1480 
CAGGGCCACCAACAAGAA.GA7 7GCCCACAGCG7G77GCGA 1520 
A.AGGGGGACAGCGGC 7ACG777CAGG7GACG7GG7AG7GA I £60 
7GGA.CGAGC7GGGG7ACA.7G7AC7 7GCG7GACCGCAGCGG IcGG 

1510 1=20 i£30 1340 
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GGA7AC-77CCGA7GGCGCGGCGAGAACG7A7CCACCACG ic40 

GAGG7GGA.AGGGG7GC7GA.GGGG0C7G7 7GGGCCAGACGG iccQ 

A 0 G 7 G G C 7 G 7 G 7 A 7 G G A G 7 G G 0 7 G ~ G C C A G G A G 7 G G A G G G 1720 

G A A A A G C G G C A 7 G G C G G C C A 7 7 G C A G A C C G C C A C A A C C A G 1730 

C 7GGACCC 7 A A C 7C A A. 7 G 7 A. C Z AG GA.A 7 7G C AGAAGGT7C i 8CG 

IciC 1320 5 53C 13 40 

77GCA7CC 7 A. 7 G Z C C A G C C C A 7C 7 7CC 7GCG7C 77 G 7C-CC i 340 
C G A A G 7 G G A 7 A C A A. C A G G C A G 0 7 7 C A A G A 7 C C A G A A C- A. C C i 3 6 0 
C G A C 7 A C A G C G 7 G A A G G C 7 7 7 G A C C C C C G C G A G A C C 7 C A. G i S 2 0 
ACCGGC7C77C~77C7A.GACC7GAAACAGC-GACGC7AC'C7 i960 
A C C C C 7 G G A 7 G A G A G A G 7 C C A 7 G •: C C G C A 7 C 7 G C G C A G G C 2 C C 0 

20:0 2020 2G3C 204-: 
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C i i : o G ^ C *^ ,= A A -*^C , ! G ' ■. C- 3G *G ~ 7 20c 0 

7C 7CC ~ G C G 7GGCC A GG 7G GCC A GG A GG A 3 2 7G "GGG ~GC 2 i 2G 
A GGA A AC : GGA A W C 1 GAG. G G GGGG 7 G "GGG 77 7G 2 7 A I 2 * SO 
AACCCAGCA7GCACACA~C7AGCC7C7GCC77GG r "G 77 7 7 2200 

22 iO 2220 2220 22*0 



7G7GGA7C 7G 7 7 7CG 7CCG 7GCCC A.GC AGGAGGGGCaC A G 22-0 
AGAC A r 7GGG "GO ~G ■ G = GG ! GG AG7GGGACCGG7 G7G "A 22SC 
GGGG : G „ : G C 7 G A G G C > - ■ G A C A G G G A C 7 G G ~ 2 G G G G 2 G 20 
C7CCC 77CCCC A 77G r GCC 7 7aGG 77CC 7C:±C 7G~GCGG 2230 
C G G 7 G A A G C A A G 7 G G G G A C C C A C A T A G G 7 G 7 7 G ~ C Z Z 7 G Z 2-00 

24 iC 2'420 2'-2C 2440 
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7GAGGG77GG7AGCAAA7GGAC0C70A7G70AGC7GGGAG 2440 
ACACA'GGAG7C7CCCA0 7GA00C00AA7CAAC7GAAGA7 24-30 
AC7G7777G7a77a77G7777GAGA~AGGG7C7:a0 7G7G 2520 
GAGGC:aAGC7GG00 7CAGGC70AOCAC7C7aC7GC0 7CC 2550 
GGGCACCAGOO 7G0 AG 7 77GA 7GA.0A 7G r A 7GGAC 7A 77G 2300 
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.2610 2620 2630 2640 

77C7AAGGG7C7TC7GAG7CCCTGC7T7CCCC7CA7G7CC 2640 
7AAAACC77CCAGAAC7GAC7C7GA7CAC77GGA7G7AGC 2630 
TAG7GTTGGCCCTGCCCACGTGTGTCAATTCAGGGGTCCC 2720 
CA.GGCA7CA7C7C7GGAGGCCC7AACC77GGCAAAGC77G 2760 
GATG7CC7CACA7CACAC-CAGGAGACCCAGGAAGG77GC7 2SQ0 

2610 2620 2S30 2S40 

G7GG7G7C7C77GGGCACCCC7GGCGGCAGCCG7GGACA7 2540 
GC77CCC7GC7G7GA7AGCCCAAAC7G77GCC7A7GACA7 2880 
77GAGG7C7ACCC77C7GGC7GCCA7GG7CCCCA77GAGA 2S20 
7C777GG7GAC7CACC7CAGCCACCAAGCCAGGCC7C7GC 2560 
C77CC77CAGC7C7AAGGGJCA7GAAGGG7G7GGACAGAGC 3000 

3010 3020 3G30 3040 

AGCCACAGGC7GCCCACAG7CACCCACA7GCAAGTG77A7 3040 
77CC77G777G7777AAAAAAA7AAACA7GC7GAGCC7TG 30SC 
AAAAAAAAAAAAAAAAAA 3GSS 
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A7GC7GC77GGAGCC7C 7C 7GG7GGGGGCGC7AC7G77C 7 ^G 
CC AAGC7AG7GC 7GAAGC 7GCCC 7GGACCCAGG7GGGA ~7 SO 
C 7CCC7Q77GC 7CC7G7AC 77GGGG7C7GG7GGC7GGCG7 120 
77CA7CCGGG7C77CA7CAAGAGGG7CAGGAGAGA7A7C7. 150 
77GG7GGCA7GG7GC 7CC 7GA AGG 7G A AG AGO A AGG 7GCG 200 
210 220 230 2^0 

AGGG7ACC77CAGGAGCGGAAGACGG 7GCCCC7GC7G777 2 'AO 
GC 77CAA7GG7ACAGCGCCA0C0GGACAAGACAGCCC 7GA 260 
77 77CGAGGGGACAGACAC 7GAC 7GGACC 77CCGCCAGC 7 320 
GGA7GAG7AC7CCAG7AG 7G 7GGCGAAC 77CC7GCAGGCC 330 
CGGGGCC 7GGCC 70 AGGC A A~G7AG7 7GCCC 7C 7 77A~GG 400 
4!C " 420 ti3C 440 
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AAAACCGCA A "GAG 7 77G 7GGG 7C 7G 7QGC 7AGGC A 7GGC 4 AO 
CAAGC7GGGCG7GGA.GGCGGC 70 70 A 7 CAAC aCCAACC 7 7 430 
AGGCGGGA7GC0C 7GCGC0A0 ~G7C 77GACACC7;aaaGG 520 
CAZ^AZiZ 70 70 A 70 77 7G GO A G 7G A G A 7 C- GO Z 70 AGO 7.- 7 330 
C 7G 7G AG A 7C C A 7 G C 7 A. G 0 0 7 G G A G 0 C C AC a C 70 A G 0 0-70 300 

ciO £20 33C 3^0 
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77C7GC7C 7GGA ~CC 7GGGAGCCC A GC AC A G7GC "G7CA £ AO 
G C A \_ A ^ A ^ ^ . s. * - ^ A ^ _ „ • v. : : _ : G G A A Z A '. Z C Z Z C G A A 3 3 0 
GC ACC 7GCCC-G "AC CCA GAC A. A GGG~ 77 7 AC A GA "a A G 720 
C 7 1 " 7C r A C A ~C " A C A C A * C G G GC A C Z A ZZ GG GC "ACC C A 730 
A A GC 7G0C A 77 G "GG 7 G C A ~ A GC AGG 7 A "7a 7 ZZ "GGC 'HOC 

£10 520 3GC 3-0 



7 7CCC7GG 7G 7AC 7 a 7GGA 7 7 GO GO A 7GCGGCC7GA 7 GAC S —0 
A 77G7C7A "GAC 7 GOO 7 COOOC 70 7 ACC AC 7CAAGCAGGA 330 
A AC - AG. ■^■^•u-^A « ; ^c-~A^ ; o ^ i i AC r CI AC GGC A "GAC S20 
7G r GG 'GA "CC GG A AGA AC 7 7C^CAGCC7:C0GG r 70 7GG 930 
G A 7 G A 7 7G 7 A 70 A A G 7 A 0 A A 0 7G 0 A C A G "GG 7 A C A G 7 A C A i OCC 

1 C I G 1020 iCCO iC<- 

77GGCGAGC 70 7GC0GC "ACC 7CC 7GAACCAGC Z A COO C G 1040 

i GAGGC ; GAG 70 : . OGGC AO A AGG 7GCGCA 7GGCAC 7GGGC 1030 

AACGG r C 7CCGG0 AG "00 A "C 7GGACCGAC 77C 700 A GOO 1 i20 

G777C:AOA7CO0OCAGG r GGO7GAG7707A7GGGGC0 AC 1 ISO 

7GAA 7GCAAC7G 7A.GC0 r GGGC AAC777GACAGCCGGG7G 1200 
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6GGGCC7G7GGC77CAA7AGCCGCA7CC7G7CC7TTG7G7 12<i0 
ACCC7A7CCG777GG7ACG7G7CAA7GAGGA7ACCA7GGA 12S0 
ACTGATCCGGGGACCCGATGGAGTCTGCATTCCCT6TCAA 1 "20 
CCAGG7CAGCCAGGCCAGC 7GG7GGG7CGCA7CA7CCAGC PcO 
AGGACCC7C7GCGCCG777CGACGGG7ACC7CAACCAGGG l£oG 
1410 142G I'42G mc 
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7GCCAACAACAAGAAGA77GC7AA7GA7G7C77CAAGAA'-- I 

GGGSACC A AGCC "ACC 7CAC7GG 7GACG 7CC7GG 7GA 7GG I '4£0 

A7GAGC7GGG77ACC7G7AC77CCGAGA7CGCAC7GGGGA 1=20 

CACG77CCGC7GGAAAGGGGAGAA7G7A7C7ACCAC7GA3 i 5cG 

G 7G6 AGGS C A C A C 7C A G CC G CC 7G C 7 7C A 7A 7GGC A G A 7G IcCG 

1510 1520 igSG 1540 


7GGC AG 7 7 7 A 7G G 7G 7 7 G A G G 7G C C A GG A A C 7GAAG G C C i = ^" 

AGCAGGAA7GGC7GCCG775CAAG7CCCA7CAGCAAC7G7 1S30 

GACC7GGAGAGC777GCACAGACC77GAAAAAGGAGC7GC i 720 

C7C7G7A7GCCCGCCCCA7C77CC7GCGC77C77GCC7C-A >7zO 

GC7GCACAAGACAGGGA.ee 77CAAG77CCAGAAGACAGAG icGG 

ISIG i=2G \z2C M-Z 


7 7 G C G G A A G C- A.GGGC 777 G A C C C A 7 C 7 G 7 7 G ~ G AAAGACC • c 4 0 
CGC7G77C 7A7C 7GGA "GC 7CGGAAGGGC 7GC7ACG77GC i EcC 
A C 7 G G A C C A. G G A. G G C C 7 A 7 A C C C G C A 7 C C A G G C A G G C G A G i 920 
GAGAAGC 7G7GA 777CCC C C "AC A TC C C 7C "GAGC-GC CA Z i SSG 
A A G A 7 G C 7 G G A 7 7 C A G A G C C C 7 A. G C G 7 C C A C CCCA.GAGGG 2 C GO 

2G10 2G2G 2G2C 2G40 

7CC 7SGGCAA 7G CC AG ACC A A AGO 7 A CCA GGGCC I G C A C C 20 -0 
7CCGCCC"AGG~GC7GA 7 7. 7CICC 7; 7 C".CAAA "GC C.A 20-EC 
AG r GAC ~CAC 7GCCGC 77 CCCCG A "0 7: IAGAGCC777C 2 I 2C 
7G~GAAAG7C7CA7CCAAGC'G7G7C7 7C7:G r CCAGGC;; 2 ' cC 
7GGCO"7GGCCCC AGGG ~77C7GA7aGGC^:777AGGA 2200 

2210 2220 2220 22'^C 



7GG7A7C7 7GGG7CCAGCGGGCCAGGG7G7GGGAGAGGAG 22-0 
7CAC7AAGA"C007CCAA"CAGAAGGGAGC77ACAAAGGA 2250 
ACCAAGGCA A A GCC7G7AG AC 7CAGGAAGC7AAG7GGC: A 2220 
GAGAC7;7AG7GGC0AG T Gi7CCCA"G r :CA.:AGAGGA*; 22zC 
77GG7CCAGAGC7GC0AA AG7G7CACC7C7;.:C7G0C7GC 2-IG 
2^;0 2^20 2'42C 2--G 

AGO 707GGGGAAAAGAGGA0AG0A 7G~GGCC AC 7GGGC AC 2--G 
C 7G 7C 70 A AG A AG 7GAGG A 7CACACAC 7CAG7CC77G77T 2-c0 
C7CCAGG77CCC77G77C77G"C7CGGGGAGGGAGGGACG 2520 
AG7G7GC7G7C7G7CC7 7CC7GCC7G"C7G7GAG 7 G7G r G 25cO 
7 7GC 7 70 7C 0 A 7 C 7 G 70 C 7 A G C C 7GAG 7G7GGG7GGAACA 2S0G 
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F [ R V F i X 7 V R R G l F G G u V L L X V K 7 K V R R Y L C £ R !< 7 V F •■ I F SO 

AS«vaF.HFGK7ALtr£GT07KV7FRGL0EYS2SyANFLG4 120 
R G L A 3 G M V V A L F fl £ N R N o F V G L WL G fl A X L G V £ A A L ? iN 7 N L 150 
R R 0 A L R H C L 0 7 3 X A R A _ ! FGSEMASA I C£ HA3L£F 7LSL 200 

210 220 220 2«iO 
1 i . i 

FCSGSW£?STVrV£7cHLOFLLcOA.= KHLrSH?i:K3rTQK 2*0 
LFY i Y73G77GLFXAA'[ V V H S ;R Y Y R ."t A 3 L V Y Y 5F R it R F fi G 2~0 
[ V YOCLFL YnS£«:<:-.F.GGWGCLLHG« 7V V i RXXF2 A2RFW 350 
CQC i KYNC7VVQY £ G £ L C R Y L L N G F F R £ A £ 5 P. H :< V R >! A L G 360 
NGLftCS [W70FS2RFH [FCVA£~Y3A7£CNC3LQ.VF-jSS¥ 400 
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GACGFNSR iLSFYYF L RL VR VN£D7^£L iRGFCGYC SFCG 440 
FGOFC-CLVGR i I C C C F R R F 0 G Y L N C G A N N < '< -ANCVFXX 450 
C- G C A Y 7 G •:■ V L VMC£LG V! _Y F = C R 7GG7 F .=, '.v!< "£N7£ 7 7£ 520 
V £ 2 7 L : R L L H M A C V A 7 V G 7 £ V F G7-£G R A * A A V i 3 F i 2 N C = 3 0 
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10 20 3G ciG 

CAC7CA7CAGAGC7AAGAGAGAC7ACACGC7C7CA7C7AC 40 
TTCAGAAAGAGCCAATGCCATGGGTATTTGGAAGAAACrA SO 
ACC 77AC7GC7G77GC7G0 77C7GC7GG77GGCC7GGGGC 120 
AGCCCCCAIGGCCAGCAGC7A7GGC7C7GGCCC7GCG77G ISO 
G77CC7GGGAGACCCCACA7GOC77G7GC7GC77GGC77G 200 

210 220 230 2'iQ 

GCA77GC7GGGCAGACCC7GGA7CAGC7CC7GGA7GCCCC 2^0 
AC7GGC7GAGCC TGG7AGGAGC AGC 7C 7 7ACC 7 7A 7 ~CC 7 250 
A77GCC7C7ACAGCCACGCCCAGGGC7ACGC7GGC7GCA7 320 
AAAGA7G7GGC777CACC77CAAGA7GC 77 77C7A7GGCC GcG 
7AAAG77CA.GGCGACGC0 77A AC AAACA7CC 7CCAGAGAC ^00 

^ 10 420 430 ' 440 
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C777G7GGA 7GC 77 7AGAGCGGC AA GCAC 7GGCA7GGCC 7 ^-0 
GACCGGG7GGC0 77GG7G7G7AC7GGG7C7GAGGGC7CC7 4c C 
CAA7CACAAA 7AGCCAGC 7GGA 7GCCAGG7CC7G7CAGGC £20 
AGCA7GGG7CC 7GAAAGC AA AGC 7GAAGGA7GCCG7AA TZ zzC 
CAGAACAC A AG AG A 7GC7GC7GC 7 A 7C 77AG7 7C 7CCCG 7 £00 

£10 620 £20 £^C 
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CCAAGACCA77 7~7GC7 77GAG7G7G7 7 7C~GGGG77GGC 6-0 
CAAG77GGGC 7GCCC 7G "GGCC 7 GG A 7C A A "CCA I AC AGC ScC 
CGAGGGA 7GCCC 1. 1 G C ■ AC AO • C \ G ; - C G G A G C 7 * 7GGGG 720 
CC AG 7G 7G C 7G A 7 7 G ~ GG A 70 C A Z A C C 7C Z AG G A G A ACC" 750 
G G A A Z A A G "CC 7 7CCC A AGC 7GC 7AGC 7GAG a: Z A 7 7 0-1 £ CC 

£:0 £20 £20 £-0 



7GC77C7ACC77GGCCACAGC7CACCC*iCCCCGGGAG7AG £40 
AGGC 7C7GGGAGC 77C0C 7GGA 7 GO 7GC ACC 7 7C 7G AC CC ££C 
AG r ACC 7GCC A GC C 77CG A GC 7 ACG A 77A AG r GGAA A r C 7 520 
CC 7GCCA7A r 7CA7C 77 7 AC 7 7CAGGGACC AC 7GGAC7CC S£C 
C A AAGC CAGCC A 7C 77 A 7C AC A "GAGCGGG7CA r iCA AG r iCCO 

io;o 1020 ;ccc iG4G 



GAGCAACG7GC 7G 7 CC 7 70 7 G 7GGA7GC AGAGC 7GA7GA7 1040 

G7QG7C7A7GACG r CC7ACC7C7G7ACCA7ACGA7AGGGC !0£0 

77G7CC77GGA77CC77GGC7GC77ACAAG77GGAGCCAC 1 120 

C 7 G 7 G 7 C C 7 G G C 0 0 C C A A G 7 7 C 7 C 7 G C 0 7 C C C G A 7 7 C 7 G G i IcG 

GC7GAG7GCCGGCAGCA7GGCG7AACAG7GA7C77G7A7G 1200 
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TGGG7GAAA7CC7GCGG7AC77G7G7AACG7CCC7GAGCA 12^0 
ACCAGA AG AC AAGA 7 AC A TACAGTGCGCTTGGCC A TGGGA 1280 
ACTGGACTTCGC-GCAAATGTGTGGAAAAACTTCCAGCAAC 1320 
GC777GG7CCCA77CGGA7C7GGGAA77C7ACGGA7CCAC 1360 
AGAGC-GCAA7G7GGGC77AA7GAAC7A7G7GGC-CCAC7GC 1 400 
1410 1420 143G 1440 

GGG5C7G7GGG A A GG AC CAGC7GC A 7CC77CGAA7GC7GA 1440 
C7CCC777GAGC77G7ACAG77CGACA7AGAGACAGCAGA 14S0 
GCC7CTGAGGGACAAACAGGG77777GCA77CC7G7GGAG 1:20 
CC AGGA A AGCCAGG AC 77C7777GACCAAGG77CGAAAGA 1560 
ACCAACCC77CC7GGGC7ACCG7GG77CCCAGGCCGAG7C 1600 
1c 10 1520 1S3C ' t64C 
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CAA7C5GAAAC77G77GCGAA7GTACGACGCG7AGGAGAC ic^O 

C7G7AC77CAACAC7GGGGACG7GG7GACC TTGGACC AGG \zcG 

AAGGG77C77C7A.C777CAAGACCGCC77GG7GACa:C77 1 720 

CC3G7GGAAGGGCGAAAACG7A.7C7AC7GGAGAGG7GGAG' 1 7cO 

7G7G7777G7C7AGGC 7AGAC 77CC7AGAG5AAG7CAA7G IcOG 

iciO 1320 icGC i=^G 

7C 7 A 7G G 7G 7 G G C 7 G 7 G G C A G G G 7 G 7 G A G G G ~ A A G G r 7GG ! c -0 
CA7GGC7GC7G7GAAAC7GGC7CC7GGGAAGa:7 777GA7 iccC 
GGGCAGAAGC r A7ACCAGCA7G7CCGC7CC7:GC7CCC7G 1920 
CC7A"GCCAGACC7CAT77CA7CCG7ATCCAGGA7 7CCC7 i9cC 
GGAGA7CACAAACAGC 7AGAA GG 7GG7A.A AG 7GACGGC 7G 2CGG 

20 iQ 2G2G 20GC 20*G 


G r GCG 7GAGGG 7 77 7GA 7G 7GGGGA 7G A 7 7GG t Ga;CCCC 20 ^G 
7C7ACA7AC7GGACAACAAGGCC:AGACC r? ::GGA"C7 2GS0 
GA "GCAGA7G~G~A GC AGGC 7G r G 7G7GAA GG-A7C7GG 2 :2C 
A A 7C 7G7GACGACG 7A GGC A AC 7GGAAGGGA A 7CCAA AiG 2 ; zC 
7G r AG AGA77GA CAC 7AG7C AGG 7 7CA.CAAA I~7G7"GG 22GC 

22 iC 2220 22GG 22^G 



G77CC AGA 7GGCC A 7GGGCG A G7AG 7 AC 7 7a GAGA A 7A A A 22-0 
C 7 7 G A A 7 G 7G "A 7 i G A A A A A A A A A A A A A A A A A A A A A A 2277 
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KLXjA V iGNTFCAAA l ! _VLFSX7 ? 3ALSVF LGLAXLGGF V 200 

210 220 220 240 
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A T I K W K 3 .= A I F [ F T 2 G 7 T G L F K F A i L S H S F 7 i G V S N V !. 5 F 2 2 0 
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F 3 A. 2 F F W A £ C F G H G V 7 V : LY VGE [ LFYLCNV=£GF EGX \ H 400 
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7VH LA.tG 7-3LSAN VWK NFGGS FQF : = I WSF Y G S T£GN V GL. '440 
MN Y V C-HC GA V GF 7 5 C • L F. Y! L 7 F F £ _ V G F j I E~A£?LFC :< :• 450 
G F C i F V £ . = G !< F G L L L 7 X V F X N G F F _ G Y F G 2 G A £ 2 N F X '_ V A 520 
N V F F V G 3 L Y F N 7 G 0 V L 7 C 0 EGFFY F C G F L G Z 7 F F W X G 'E.N £ 2 0 
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GC7C7C7GGGCC7A7A7CAAGC7GC7GAGGTACACGAAGC *C 
GCCA7GAGCGG07CAAC7ACACGG7GGCGGACG7C77CGA SO 
ACGAAA7G77CAGGCCCA 7CCGGA.C A A GG7GGC7G7GG TC 120 
AGTGAGACGCAACGCTGGACC 7TCCG7CAGG7GAACGAGC 150 
A7GCGAACAAGG7GGCCAA7G7GC TGCAGGC TC AGGGCTA 200 

2 SO 22C 230 2 'AO 

CAAAAAGGGCGA7G 7GG7GGCCC 7GTTGC7GGAGAAGCGC 2*M3 
GCCGAG7ACG7GGCCACC TGGC TGGGTCTCTCCAAGATCG 28G 
GTG7GA7CACA0CGC7GA.7CAACA0GAA7C7GCGGGG7CC 220 
C7CCC7GCrGCACAGCA7CACGG7GGCCCA r 7GC7CGGC7 350 
C7CA777ACGGCGAGGAC7TCC7GGAAG0 7G70AC0GACG ^00 

7GG0CAAGGA7G7G0CAG0GAACC7CA0AC7e77C0AG77 H-Q 
CAACAACGA.GAACAACA AC AGCGA.GACGGA.AAA.GAACA7A ^50 
CCGCAGGCCA.AGA.A 7C 7GA. A CGCGC 7GC TGACCACGGCCA 520 
GC7A 7GAGA AGCC "A AC AAGACGC AGG 77AACCACCACGA 550 
C AAGC7GG 7CTACA TC TA CACC 7CCGGC ACC AC AGGA T;7G £00 

£10 £20 £20 £40 
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CC AAAGGC 7GCGG 7 7A"7C7C AC 7CCCG 7 7A7C7G7 7 7A 6*0 
70GC7GC7GGCA 7 C 0 A C 7 A C A C 0 A ~GGG 7 7 7C C A GG A G G A 550 

GGACA7C7:c7Ac^eoc:e 7 7gcc777g 7 ^::aca::gc" 720 

GG 7GGCA T "A ~G r GO A "GG G ""A G 7CGG ~GC 7C 777 GGC 7 750 
CCACGG7C7:CA T *7CGCAAGAAG77C7CGGCA r CCAAC7A 200 

5!0 520 £20 £40 



T77CGCCGAC 7G0GCCA AG T A 7A A 7GC A ACTi 7 7GG7CAG c^O 
7A 7A7CGG "GAGA 7G GO TOGO 7ACA77C 7AGC 7 ACG A A AC £50 
CC7CGG A A 7 A CGACC AG A A A C A C C G A G 7 G C G * C 7 G G 7 C 7 7 52 0 
7GGAAAC GGAC 7GCGA CO GO A GA " 7 7 GGC CaCAG t 77G 7G 950 
CAGCGC77CAAC^ r 7GCCAAGG r 7GGCGA:-7C7ACGGCG 1000 
10 iO 1020 iCCO 10^0 



CCACCG A GGG "AA 7GCG A AC A 7C A 7GAA "CA 7GACA AC AC IG^C 
GG 7GGGCGCC A. 7CGGC 7 7 7 G r G 7CGCGCA 7CC 7GCCCA AG i OSC 
A7C7ACCCAA7C7CGA7CA "CGCGCCGA~CCGGACACCG i ! 20 
GAGAGC CC A 7 7AG A G A. 7 A.GG A A 7GGCC 7A 7GC C A AC 7G7G 1 IcO 
CG07CCCAACGAGCOAGGCG r A77CA^CGGCAAGA7CG7C 1200 
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AAAGGAAA7CC77C7CGCGAA77CC7CGGA7ACG7CGA7G 1 2^0 
AAAAGGCC7CCGCGAAGAAGA77G77AAGGA7G7G77CAA 12*0 
GC A 7GGCG A 7A 7GGC 7 7 7C A 7C7CCGGA.GA7C7GC7GG77 132G 
GCCGACGAGAAGGG77A7C7G7AC77CAAGGA7CGCACCG 1 360 
G7GACACC77CCC-C7GGAAGGGCGAGA.A7G~uCCACCAG 1 400 
1410 1420 142G 1440 

CGAGG7GGAGGCGCAAG7CAGCAA7G7GGCCGG77ACA.AG 1440 

GA7ACCG7CG777ACGGCG7AACCA77CCGCACACCGAGG 1 4S0 

G A A.GGGCCGGCA~GGCCGCC A. 7C7A7GA 7CCGGAGCGAGA 1 =20 

A77GGA.CC7CGACG7C77CGCCGC7AGC 77GGCCA.AGG7G I 550 

C7GCCCGCG7ACGC7CG7CCCCAGA7CA77CGA77GC7CA 1500 

1c 10 1c 20 1520 1540 
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CCAAGG7GGACC7GA.C7GGAA.ee 777AAC-:"GCGCAAGG7 1540 

A G A C C 7 G C A G A A G G A. G G G C 7 A C G A 7 C C G A A C G C G A 7 C A A G 155 0 

GACGCGC7G7AC7ACCAGAC 77CCAAGG G7CGG7A.CGAGC 1720 

7 G C 7 C A C G C C C C A G G 7 7 7 A 0 G A 2 0 A G G 7 G C A G C G C A A C G A 1750 

A A 7CCGC 77C 7AAGA GC 7GCAA 7AGAG772- ~G7C 7GAACC i 500 

IciO 1520 1520 i£4C 


77GCC 777 7GCCC AA. 7A "GC 7G "7AA 77AG "77G 7AA.GGC 15-0 
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TCC77G7GCCGGG7AACGA7GG7CGAG7GCGCAGCC7CAA 1540 

77G7CA7GGCAGACGGCG7GACAGAG7CGACA77CGC77C 1650 

GC7SCCC77GCAAAGCACGCCCGAGA7CGG7TACCGGG7T 1720 
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CAGGAAGG7A7AGACCCAGA7AAGA777CC2GCGAAGA7A 1640 
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CTT7ACCA77CATC AGC77CA77C7GCA77777AGC7TGA 40 
CGGCAGCCGGG7C7ACGC7GA7CA7CGGCCGCAAG77C7C £0 
CGCGAGAAAC77CA7AAAGGAAGCGCGCGAGAACGACGCC 1 20 
ACGG7CATCCAGTACGTGGGTGAGACCTTGCGATATCTGC 160 
7CGCCACCCCCGGTGAAACCGA7CCAGTTACTGGCGAAGA 200 
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CC7GGACAAAAAGCACAA.7A77CGAGCAG7A7ACGGCAAC 240 
GGGC7ACGGCCC-GA7ATC7GGAACCGC77CAAGGAGCGC7 230 
7CAACG7GCCGACGG77GCCGAA7777A7GC7GCAACCGA 320 
GAGCCCAGGCGGAA.CA7GGAAC7A77CAACAAA7GAC7TC 360 
AC7GCCGGAGCCA77GGGC ACAC7GGCGTGC77AG7GGA7 400 
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GGC77C77GGACGCGGCC77AC7A77G7CGAGG7GGACCA 440 
GGAA7CACAGGAACCA7GGCGCGA7CCCCAAA:CGGG77C 4c0 
7C-CAAGCCGG7CCCGCGAGGCGA.AGCAGGCGAGC7CC7G7 520 
A7GCCA77GA7CCC-GCCGACCCGGGCGAGACC77CCAGGG 560 
C7AC"ACCGCAAC 7CC TTTAGAGCACAC 7GGCGGCGG 537 
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GCAAAGGCCGACGCGTGGC7GCGGACGGG7AACG7GATCA W 
GGGCGGACAACGAAGGGCGAC7C77C77CCACGACCGGA7 SO 
CGGAGACACGrTCCGAfGGAAGGSAGAGACNGTCAGCACA 120 
CAAGAGG~CAG777GG7GC7CGGACGACACGACrCAA7CA 160 
AGGAGGCCAACG7G7ACGGCG7GACGG7GCCGAACCACGA 200 
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CGGGCGGGCCGGC7GCGC7GCGC7CACGC7A7CAGACGC7 2^0 
C7GGCGAC7GAAAAGAAGC7GGGCGA7GAGC7GC7AAAGG 2S0 
GATTGGCrACTCACTCGTCGACTrCGCTTCCCAAGrTTGC 320 
GG7GCCGCAG77CC7ACGGG7GG7GCGCGGCGAGA7GCAG 360 
7CAACGGGCACCAACAAGCAACAGAAGCACGACC7GAGGG ^GG 
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CC7777GC7ACAAC7ACC770CAGAAAGG7GAC777GGAA 1240 
T7GGCGCA7G7AGGAAC7A7GG7AC7A7AA 77CAA7GG77 1280 
TTTGTCATTCCAACAAACATTGGrAAGGATGGACCCAAAT 1320 
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AAAG7ACC 7AAA7A7GAAGG7AGAGC7GG7777GCAG77A 1720 
T7AAAC7AAC7GACAAC7C7C77GACA 7CAC7GCAAAGAC 1 75C 
CAAA77A77AAA7GA 77CC77GAGCCGG77AAA7C7ACCG I €00 
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7C77A7GC7a7GCCCC7A777G77AAA777G:7Ga7GAAA 1540 
T 7AAAA7GACAGA 7AACC 7C A ~AA AA7777GA 1572 
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G7G7CCGA 77 AC 7ACGGCGGCGCACACACAACGG7CAGGC 
7GA.7CGACCTGGC AAC 7CGGA 7GCCGCGAG~GT7G'GCGGA £0 
CACGCCGG7GA77G7GCG7GGGGC A.A7GACCGGGC7GC 7G 120 
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AGGACCGGGCCGC7CGC7ACGG7GACGGAG7C77C07GAA 2G0 
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6GCC7G7TGC77AGCCGGG7CAACCGGC7GCAGCCG77CG 1 240 
ACGGC7ACACCGACCCGG7TGCCAGCGAAAAGAAG77GG7 12S0 
GCGCAACGCTTTTCGAGATGGCGACTGTTGGriCAACACC 1320 
GG7GACG7GA7GAGCCCGCAGGGCA7GGGCCA7GCCGCC7 1360 
7CG7<;GA7CGGC7GGGCGACACC77CCGC7GGAAGGGCGA HOC 
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GAATGrCGCCACCACTCAGGTCGAAGCGGC ACTGGCCTCC 1 440 
GACCAGACCG7CGAGGAG7GCACGG707ACGGCG7CCAGA 1 460 
77CCGCGCACCGGCGGGCGCGCCGGAA7GGCCGCGATCAC 1 520 
AC7GCGCGC7GGCGCCGAA77CGACGGCCAGGCGC7GGCC 1560 
CGAACGG"77ACGG7CAC77GCCCGGC7A7GCAC77CCGC 1600 
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7C777G77CGGG7AG7GGGG7CGC7GGCGCACACCACGAC 1640 
G77CAAGAG7CGCAAGG7QGAG77GCGCAACCAGGCC7A7 1650 
GGCGCCGACA7CGAGGA7CCGC7G7ACG7AC7GGCCGGCC 1720 
CGGACGAAGGA7A7G7GCCG7AC7ACGCCGAA7ACCC7GA 1750 
GGAGG777CGC7CGGAAGGCGACCGCAGGGC7AG 1794 
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1 cga err ace cgr ceg cge ere ffeg a?t jp: aae cb? acj erg ccg aca gag gaa 

61 sc? ?ec === acg gc= gee ctre erg erg erg ere erg erg ceg erg era ceg erg ere erg 
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121 erg erg aag er* esc err rgr; erg cag :rg c^: ^ c:; cc; gc? 52c rrg gcr err ceg 
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241. gae erg gaa ger err gag egg gge 5ge age erg gec tgg egr ere 7=7 gaa erg gee eag 
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301 eag ere gee geg cac ace ere ere «r ear s?c *^ cn eje ire agr eae tea gag scg 

95 CHAASTriIHCSaR?«VSEA 

3£1 gag cge gag age aac agg gee gca cge see 5:: cu q-. sc; era 55= egg sac egg gga 
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421 err gs.e gge gge gae age gge gag ggg agr — gga ga* — gag rgg gca grg erg gga 
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12S1 see egg ccc; sea. fftg cer a^c egg ccc cgc cea sac acc tgy cac cgc ccrr gec- cgc- cgc 

IZ2L sec egg- ccc cry cac gcg cc? «*=a cac 55a ccg aca. gag gee aac gcg gee acc ace 

435 ?C?I.QVLET^GLTHCN^V>ATr 

13ai azc cac aca sea ee-e egg 3?= S 5 ^ 7=" S3» cgc gec tec zgg ccc cac aag cac acc ccc 

4f5 NYTGCRGA^GRASffLYKHIF 

1441 ccc ccc zzz ccg acc cgc cac gat gec acc ac* gga cac cea acc egg gac ccc cag egg 

475 PFSLIRYDVTTG2PIROPQG 

1501 cac tgc aeg gec aca ccc cea ggc gag cca ggg ccg seg geg gee ccg gca age cag cag 

«tS5 HCMATS?C£P,GLiVAPVSQQ 

1561 esc cca tec ccg ggc car gec ggc gec cca gag czz gee cag ggg a-ae ccg era asg gac 

Isll gec its egg csr egg gac gec ccc zzc aae acc. egg gac ccg ccg gee tgc gat g=c caa 

1£32 ggc etc ccc ege ::: cac gac cgc acc gga gac acc zzz agg egg aag ggg gag aac gcg 

£53 G r L R F H 0 RTGDT?RW-SGS2IV 

ITsZ grc aca acc gag ccg gca gag gcz zzz gag ges cza gac ccc sec cag gag gcg aac gec 

573 ATTSTASvrSAtOFtCrvsy 

1201 cac gga gee ace gcg zea ggg ci: gai ggc egg gc- gga, a^g gca gee cca gec ccg ege 

Srz Y G V T V ? G 2 Z G ?. A G A A t. V L R 

1SSI ccc ccc cac gec ccg'gac ccc a:; cag ccc cac acr cac geg ccc gag aac erg cca cc- 

51£ j9SALD f -fcC:.T7::VSE3JLr? 

lrcl cac gee egg ccc cgs ccc ccc egg zzz cag gag zzz zzz zzz acc aca gag acr cec aaa 

525 v A 3,psirLRLCra£.AT7£TrX 

1581 cag cag aaa gee egg acg gca. aat gag ggc ccs gac c== age acc ccg ccc gac cca ccg 

$== QQSVRllAN£CFw?SrLSC?L 

2C^i cac gc; ccg gac cag gec gca gec cac r=g zzz ccr aca. acc gec egg cac age gec 

573 YVL02A'/GAYw?i?TARYSA 

21C1 ccc ccg tzjz gga aac ccc cga acc cga gaa rcc :~a cac rcg agg cac ccg aga gag raa 

Sr3 C. i A G N C R r 

2131 cec =gc 
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G * QQATGGCTGGTGGTC-TCCAACipSSSSSSSdA 

GGCCGGGTGATCCGGCTGGqG^i^^SS GTAG 
* a TT CG^^Ml^C-AACTAAGTAACAAAAGG 

> C » GAGTCCATGGGTCACATTG AGTTGCTGATAC- 
TAC TTGGTCATATTTGGGAAGTC-GGTAGACAGAT 
TTCCTTAAAGGCAGGTAGTTAGC-C-CTTTGGAGCA 
CTCATCAGAGCTAAGAGAGATTACACGCTCTCAT 

CTACTTCAGAAAGAGCCAATGCCATGi - 
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HsFATPj Qohni003f04): 

CCCGGGTTTCTGCTCTCCGCCCSTGTGGAGTGGTG^^ 
GCGCTCCC 

rGGAAGGAGAAGTCTt^GCl^GAACSAGCGCCCCra^ 1 ' T GCCaGC 
GGCTGGAA 

CCAGACGCTCCCCATMAGCAAGCGGGCTCCATCGCTGCC^ 
GCTGCTGC 

rGCTGAAGCT^CACCTCTGGCCGCAGTTGCOCTCGCTTCCSGCGGACTTGGCCTTTGCGGTGCGS.GCTCXQX 
GCTGCAAA 

AGGGCTCTTCGAGCTCGCGCCCTGGCCGCGGCTGC 
GCCTGGCG 

CCTCGCGGAAC?GGCCCL\GCAGCGCGCCGCGCACACC^ 
AGAGGCGG 

AGCGCGAGAGTT^CAGCCCTGC^CGCXSCCtrrCCTACGTGCGCTAGGCTGGGACTGGGCACCCGACGGCGGCG 
ACAGCGGC 

G AGGGG AGCGCTGG AG AAGG CG AG CGGGCAGCGCCGGG AG CC GG AG ATGCAGCGG CCGG AAG CGC C GCGG AG 
TTTGCCGG 

AGGGGACGGTGCCGCCAGAGGTGCAGGAGCCGCCGCCCCTCTGTCACCTGGAGCAACTGTGCCGCTGCTCCT. 
CCCCGCTG. 

GCCCAGAGTTTCTGTGCCTCTGGT^CGGGCTGGCO^GGGCGGCCTGCSCACTGCCTTTGTGCCCACCGCCC 
tTGCGGCGG 

GGCCCCC^GCTGCACTGCCTCCGCAGCTGCGGCGCGCGCGCGC^GGTGCTGGCGCCAGAG-^TCTGGAGTCC 
CTGGAGCC 

GGACCTGCCCGCCCTGAGAGCCATGGGGCTCCACC'TGTGGGCrrGCAGCCCCAGGAACCCACCCTGCTGGAAr 
TAGCGATT 

rGC~GGC?GAJ\G~GTCCGCTGAAGTGGATGGGCCAGTGCCAGGAr^^ 
ACACGTGC 

CTGTACArCTTCACCTCTGGCACCACGGGCCTCCCCAAGGCTGCTCSGArCAGTCATCTGAAGATCCTGCAA 
TGCCAGGC 

cttcta-cagctgtgtggtgtccaccagcaaga-gtga-c^cctcgccctcccactctaccacacgtccgg 
ttccctgc 

tgggca:tcgtgggc?gca?gggcattggggc:^cag7ggtgctga^ 

GGGAAGAr 

rGCCAGCAGCACAGGGTGACGGrGTTCCAGTACArrGGGGAGCTGrGCCGACACCTTG 
AGCAAGGC 

AGAACGTGGCC^rAAGGTCCGGCTGGCAGrGGGCAGCGGGCrGCGCCCAGA 
GCGCTTCG 

GCCCCCTGCAGGTGCTGGAGACATArGGACTGACAGAGC^CAACGTG^ 
GGCCC GCT 

GTGGGGCGTGCTTCC-rGG C 'r' rTA CAAGCArArCTTCCCC— crCCTTGArTCS'CTATGArGTCACCACAGGA 
GAGCCAAT 
XCGGGAC 




;tg 

GAAGGGGGAGAArGTGGCCACAACCGAGGT^CAGAGGTCTTCGAC^ 

CGTCTATG 

GAGXCACTGTGCCAGGGC^rGAAGGCAGGGCTGGAAT^ 
ACCTTATG 



~G. ILIA 
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Car^CSACaCCCACGT ^ 'lX.TG^ 

ttccccac 

caooagaccttcaaacagcagaa^^ 
actgtacg 

ttctggaccaggctgtagctccctacc^ 
ttcsaatc 

TG AG AACTT CCA CAC CTGAGGCACCTG AG AG AGGAACT CTGT&GGGTGGCGGCCGTTGCAGGXwTACTGGCC 

tctcAgcg 

atcttttctataccagaactgcggtc^^ 

AAAAAAAA 
AA 
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FATP3 protein: 

DWGFEGGDSG2GS;U^GI3IAAJ^GDAXAG^ 

GLAKAGLS 

EGPVPGTI. 

SSP£S ITDTCL^IIFTSGTTGI^XAA^SHI^XLC^ 

S AG^FTOCCCQERVTVTQYX G2XCSZI.VNQPP SXAZ?.GHXT7KiAVG5GtJl? D TWS^lTVr^J'G? 
GX.T2GNTO 

TXHrTGQRGAVGxL^S*WI.rSHZP?r VTTGZ? 132 ?gGEC»r StPGSSGIiVAPVSQQS? FX.GY aGG? 

KLACGXIX 

JCD VTBJ?GD VTTJTTG C LX.VCT C CGTU^THE xlTGCT7^ r ^~2NV3lTTZV3^/7ZALJ3 3T.QIT/NVTGVTVPGHZG 
SAIXAGNUIZ 



FIG. 112 
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Peptide Sequence of hsFATP fragments for fatty acid binding experiments:
SP1: 

RVFIKTII^IFGGL VIJLKVKAKVRQCL QERRTVPIIJASTVREHPDKTALIFEGT 
DTHWTFRQLDEYSSSVANFLQARG 

LAS GDV AAffMENKNEFV GLWL GMAKL GVEAAII^ro^IJlRD ALIJHCLTTSRAR 

ALVFGSEMASAICEVHASLDPSLSLFC 

S GSWEPGAVPPSTEHLDPLLKDAPKHLPSCPDKGFTD 



Fig. 114A 



SP2: 

RVnKTTRPJDIFGGLVLOCV?LA[<^ 
DTK^VTFRQLDEYSSSVANPLQARG 

L^GD V A'UI^IENRNEFVGL'WT GMAKL GVEAAIJNTNTPJID AIXECLTTSRAS. 
ALVFGSEMASAICEVHASLDPSLSLFC 

S GSWEPGAVPPSTEKLDPLLKD APKHLPSCPDKGFTDKLFYnTS GTTGLPKA.iTV 

VTiSRYYR^LAALVYYGFR-NIRPNDn/ * 

YDCLPLYH 

Fig. 114B



GD%*A\ffMENT^EFVGLWLGNLA^^ 

FGSE3v£ASAICEVHASLDPSLSLFCSGS ■ __ r „„ , 

^GAWPSTEKLDPLLK^ 



SRYYI^lAALVY^"GFElNCRPNDrv™i"DC 
LPLYK 

Fig. 114C



SP5: 

FJAllVNX^T\QELlRG?DGVCIPCQPGE?C^ 

kzakd vtkxgdq a yltgd vl vmdel 

GYLYFRDRTGDTtRWXGE^/STTZVEGTL^ 
MA.A.VASPTGNCD LERJAQVLEXELPLY 
ARFIFimJElSTGTYKrQKTELRXSGF^^ 
AYSRIQAGEE5L. 



Fig. 114D
Oleate binding to FATP-GST fusions
SP1 SP2 SP3 SP5 GST

FATP-GST

Fatty acid binding to mmFATP4 fragments

FIG. 117

AMP-BINDING DOMAIN
(54) Title: FATTY ACID TRANSPORT PROTEINS

(57) Abstract: A family of fatty acid transport proteins (FATPs) mediates transport of long chain fatty acids (LCFAs) across cell
membranes into cells. These proteins exhibit different expression patterns among the organs of mammals. Nucleic acids encoding
FATPs of this family, vectors comprising these nucleic acids, as well as the production of FATP proteins in host cells are described.
Also described are methods to test FATPs for fatty acid transport function, and methods to identify inhibitors or enhancers of transport
function. The altering of LCFA uptake by administering to the mammal an inhibitor or enhancer of FATP transport function of a
FATP in the small intestine can decrease or increase calories available as fats, and can decrease or increase circulating fatty acids.
The organ specificity of FATP distribution can be exploited in methods to direct drugs, diagnostic indicators and so forth to an organ
such as the heart.
FATTY ACID TRANSPORT PROTEINS

RELATED APPLICATION(S) 

This application is a continuation-in-part of U.S. Patent Application
Number 09/506,252 filed February 17, 2000 which is a continuation-in-part of U.S.
Patent Application Number 09/465,280 filed December 16, 1999 which is a
continuation-in-part of U.S. Patent Application Number 09/405,505 filed September
23, 1999, and is a continuation-in-part of U.S. Patent Application Number
09/232,195 filed January 14, 1999, both of which claim the benefit of U.S.
Provisional Application Number 60/110,941 filed December 4, 1998; U.S.
Provisional Application Number 60/093,491 filed July 20, 1998; and U.S.
Provisional Application Number 60/071,374 filed January 15, 1998. This
application is also a continuation-in-part of U.S. Patent Application Number
09/405,504 filed September 23, 1999, which is a continuation-in-part of U.S. Patent
Application Number 09/232,201 filed January 14, 1999, which claims the benefit of
U.S. Provisional Application Number 60/110,941 filed December 4, 1998; U.S.
Provisional Application Number 60/093,491 filed July 20, 1998; and U.S.
Provisional Application Number 60/071,374 filed January 15, 1998. This
application is also a continuation-in-part of U.S. Patent Application Number
09/232,197 filed January 14, 1999, United States Patent Application Number
09/232,200 filed January 14, 1999 and International Application Number
PCT/US99/00182 filed January 14, 1999, each of which claims the benefit of U.S.
Provisional Application Number 60/110,941 filed December 4, 1998; U.S.
Provisional Application Number 60/093,491 filed July 20, 1998; and U.S.
Provisional Application Number 60/071,374 filed January 15, 1998. The teachings
of each of these referenced applications are incorporated herein by reference in their
entirety.

GOVERNMENT SUPPORT 
The invention was supported, in whole or in part, by a grant from the
National Heart, Lung, and Blood Institute (HL41484), by National Institutes of 
Health Grant DK 47618 and National Institutes of Health Grant 5 T32 CA 09541. 
The United States Government has certain rights in the invention. 

BACKGROUND OF THE INVENTION 

Long chain fatty acids (LCFAs) are an important source of energy for most
organisms. They also function as blood hormones, regulating key metabolic
functions such as hepatic glucose production. Although LCFAs can diffuse through
the hydrophobic core of the plasma membrane into cells, this nonspecific transport
cannot account for the high affinity and specific transport of LCFAs exhibited by
cells such as cardiac muscle, hepatocytes, enterocytes, and adipocytes. The
molecular mechanisms of LCFA transport remain largely unknown. Identifying
these mechanisms can lead to pharmaceuticals that modulate fatty acid uptake by the 
intestine and by other organs, thereby alleviating certain medical conditions (e.g. 
obesity). 

SUMMARY OF THE INVENTION

Described herein is a diverse family of fatty acid transport proteins (FATPs) 
which are evolutionarily conserved; these FATPs are plasma membrane proteins 
which mediate transport of LCFAs across the membranes and into cells. Members 
of the FATP family described herein are present in a wide variety of organisms, from
mycobacteria to humans, and exhibit very different expression patterns in tissues
among the organisms. FATP family members are expressed in prokaryotic and 
eukaryotic organisms and comprise characteristic amino acid domains or sequences 
which are highly conserved across family members. In addition, the function of the 
FATP gene family is conserved throughout evolution, as shown by the fact that the
Caenorhabditis (C.) elegans and mycobacterial FATPs described herein facilitate
LCFA uptake when they are overexpressed in COS cells or Escherichia (E.) coli,
respectively. FATPs are expressed in a wide variety of tissues, including all tissues
which are important to fatty acid metabolism (uptake and processing).

In specific embodiments, FATPs of the present invention are from such
diverse organisms as humans (Homo (H.) sapiens), mice (Mus (M.) musculus), F.
rubripes, C. elegans, Drosophila (D.) melanogaster, Saccharomyces (S.) cerevisiae,
Aspergillus nidulans, Cochliobolus heterostrophus, Magnaporthe grisea and
Mycobacterium (M.), such as M. tuberculosis. As described herein, four novel
mouse FATPs, referred to as mmFATP2, mmFATP3, mmFATP4 and mmFATP5,
and six human FATPs, referred to as hsFATP1, hsFATP2, hsFATP3, hsFATP4,
hsFATP5 and hsFATP6, have been identified. All four novel murine FATPs
(mmFATP2-5) and a previously identified murine FATP (renamed herein FATP1)
have orthologs in humans (hsFATP1-5); the sixth human FATP (hsFATP6) does not
as yet have a mouse ortholog. The expression patterns of these FATPs vary, as
described in detail below.

The present invention relates to FATP family members from prokaryotes and 
eukaryotes, nucleic acids (DNA, RNA) encoding FATPs, and nucleic acids which
are useful as probes or primers (e.g., for use in hybridization methods, amplification
methods) for example, in methods of detecting FATP-encoding genes, producing 
FATPs, and purifying or isolating FATP-encoding DNA or RNA. Also the subject 
of this invention are antibodies (polyclonal or monoclonal) which bind an FATP or 
FATPs; methods of identifying additional FATP family members (for example,
orthologs of those FATPs described herein by amino acid sequence) and variant
alleles of known FATP genes; methods of identifying compounds which bind to an 
FATP, or modulate or alter (enhance or inhibit) FATP function; compounds which 
modulate or alter FATP function; methods of modulating or altering (enhancing or 
inhibiting) FATP function and, thus, LCFA uptake into tissues of a mammal (e.g.,
human) by administering a compound or molecule (a drug or agent) which increases
or reduces FATP activity; and methods of targeting compounds to tissues by 
administering a complex of the compound to be targeted to tissues and a component
which is bound by an FATP present on cells of the tissues to which the compound is 
to be targeted. For example, a complex of a drug to be delivered to the liver and a 
component which is bound by an FATP present on liver cells (e.g., FATP5) can be 
administered.

In one embodiment, the present invention relates to modulating or altering 
(enhancing or inhibiting/reducing) LCFA uptake in the small intestine and, thus, 
increasing or reducing the number of calories in the form of fats available to an 
individual. In another embodiment, the present invention relates to inhibiting or
reducing LCFA uptake in the small intestine in order to reduce circulating fatty acid
levels; that is, LCFA uptake in the small intestine is reduced and, therefore, 
circulating (blood) levels are not as high as they otherwise would be. FATP4 has 
been shown to be expressed in epithelial cells of the small intestine and particularly 
in the brush border layer of the small intestine. FATP2 has also been shown to be
expressed at low levels in epithelial cells of the small intestine, particularly in the
duodenum. In contrast, FATP1, FATP3, FATP5 and FATP6 were not detected in 
any of the intestinal tissues. Thus, also described herein are FATPs which are 
present in the epithelial cell layer of the small intestine where they mediate LCFA 
uptake. These FATPs, particularly FATP4 and also FATP2, are targets for methods
and drugs which block their function or activity and are useful in treating obesity,
diabetes and heart disease. The ability of these FATPs to mediate fat uptake can be 
modulated or altered (enhanced or inhibited), thus modulating fat uptake in the small 
intestine. This can be done, for example, by administering to an individual, such as 
a human or other animal, a drug which blocks interaction of LCFAs with FATP4
and/or FATP2 in the small intestine, thus inhibiting LCFA passage into the cells of
the small intestine. As a result, fat absorption is reduced and, although the 
individual has consumed a certain quantity of fat, the LCFAs are not absorbed to the 
same extent they would have been in the absence of the compound administered. 
Thus, one embodiment of this invention is a method of reducing LCFA
uptake (absorption) in the small intestine and, as a result, reducing caloric uptake in
the form of fat. A further embodiment is a compound (drug) useful in inhibiting or 
reducing fat absorption in the small intestine. In another embodiment, the invention
is a method of reducing circulating fatty acid levels by administering to an individual 
a compound which blocks interactions of LCFAs with FATP4 and/or FATP2 in the 
small intestine, thus inhibiting LCFA passage into cells of the small intestine. As a 
result, fatty acids pass into the circulatory system at a diminished level and/or rate,
and circulating fatty acid levels are lower than they would be in the absence of the 
compound administered. This method is particularly useful for therapy in 
individuals who are at risk for or have hyperlipidemia. That is, it can be used to 
prevent the occurrence of elevated levels of lipids in the blood or to treat an
individual in whom blood lipid levels are elevated. Also the subject of this
invention is a method of identifying compounds which alter FATP function (and
thus, in the case of FATP2 and/or FATP4, alter LCFA uptake in the small intestine). 

In another embodiment, the present invention relates to a method of 
modulating or altering (enhancing or inhibiting) the function of FATP6, which is
expressed at high levels in the heart. A method of inhibiting FATP6 function is
useful, for example, in individuals with heart disease, such as ischemia, since 
reducing LCFA uptake into heart muscle in an individual who has ischemic heart 
disease, which may be manifested by, for example, angina or heart attack, can reduce 
symptoms or reduce the extent of damage caused by the ischemia. In this
embodiment, a drug which inhibits FATP6 function is administered to an individual
who has had or is having a heart attack, to reduce LCFA uptake by the individual's
heart and, as a result, reduce the damage caused by ischemia. In a further 
embodiment, this invention is a method of targeting a compound, such as a 
therapeutic drug or an imaging reagent, to heart tissue by administering to an
individual (e.g., a human) a complex of the compound and a component (e.g., a
LCFA or LCFA-like compound) which is bound by an FATP (e.g., FATP6) present
in cells of heart tissue. 

In a further embodiment, LCFA uptake by the liver is modulated or altered 
(enhanced or reduced) in an individual. For example, a drug which inhibits the
function of an FATP present in liver (e.g., FATP5) is administered to an individual
who is diabetic, in order to reduce LCFA uptake by liver cells and, thus, reduce
insulin resistance. 

The present invention, thus, provides methods which are useful to alter, 
particularly reduce, LCFA uptake in individuals and, as a result, to alter (particularly 
reduce) availability of the LCFAs for further metabolism. In a specific
embodiment, the present invention provides methods useful to reduce LCFA uptake
and, thus, fatty acid metabolism in individuals, with the result that caloric 
availability from fats is reduced, and circulating fatty acid levels are lower than they 
otherwise would be. These methods are useful, for example, as a means of weight
control in individuals (e.g., humans) and as a means of preventing elevated serum
lipid levels or reducing serum lipid levels in humans. FATPs expressed in the small 
intestine, such as FATP4, are useful targets to be blocked in treating obesity (e.g., 
chronic obesity) or to be enhanced in treating conditions in which enhanced LCFA 
uptake is desired (e.g., malabsorption syndrome or other wasting conditions). 

The identification of this evolutionarily conserved fatty acid transporter
family will allow a better understanding of the mechanisms whereby LCFAs traverse
the lipid bilayer as well as yield insight into the control of energy homeostasis and 
its dysregulation in diseases such as diabetes and obesity. 



BRIEF DESCRIPTION OF THE DRAWINGS 

The file of this patent contains at least one color photograph. Copies of this
patent with color photographs will be provided by the Patent and Trademark Office
upon request and payment of necessary fee. 

Figure 1 shows the amino acid sequence alignment of FATPs: mmFATP1
(SEQ ID NO:92), mmFATP2 (SEQ ID NO:93), mmFATP3 (SEQ ID NO:94),
mmFATP4 (SEQ ID NO:95), mmFATP5 (SEQ ID NO:96), ceFATPa (SEQ ID
NO:97), scFATP (SEQ ID NO:98) and mtFATP (SEQ ID NO:99). The underlining
(amino acid residues 204-212 of mtFATP) indicates an AMP binding motif which is 
found in many classes of proteins; the underlining at amino acid residues 204-507 of 
the mtFATP sequence indicates the FATP 360 amino acid signature sequence. 
Figures 2A-2D show results of LCFA uptake assays. Figures 2A-2D: COS
cells were cotransfected using the DEAE-dextran method with the mammalian
expression vector pCDNA-CD2 either alone (control; Figure 2A) or in combination
with one of the FATP-containing expression vectors (pCDNA-mmFATP1, Figure
2B; pCDNA-mmFATP2, Figure 2C; or pCMV-SPORT2-mmFATP5, Figure 2D) as
described in Materials and Methods for Example 2. COS cells were gated on
forward scatter (FSC) and side scatter (SS), and the results shown represent >10,000
cells. Cells exhibiting >300 CD2 fluorescence units (vertical line), representing 15%
of all cells, were deemed CD2 positive.

Figure 3 is a graph of fluorescence of cells expressing a FATP gene. As in
Figures 2A-2D, COS cells were cotransfected with pCDNA-CD2 either alone
(control) or in combination with one of the FATP-containing expression vectors
(pCDNA-mmFATP1, pCDNA-mmFATP2, pCMV-SPORT2-mmFATP5, or
pCDNA-ceFATPb). The mean BODIPY-FA fluorescence of the CD2-positive cells
is plotted; results shown represent the average of three experiments, each consisting
of greater than 50,000 COS cells. Note that a logarithmic scale is used on the 
ordinate. 
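
The gating and averaging described for Figures 2A-2D and 3 can be sketched as follows. This is a minimal illustration only, assuming each cell is recorded as a (CD2 fluorescence, BODIPY-FA fluorescence) pair; the function name, the data layout and the sample values are hypothetical, and only the 300-unit CD2 threshold is taken from the text.

```python
# Hypothetical sketch: mean BODIPY-FA fluorescence of CD2-positive cells.
# Each cell is a (cd2_fluorescence, bodipy_fluorescence) tuple; this layout
# is an assumption made for illustration, not the authors' data format.

def mean_bodipy_of_cd2_positive(cells, cd2_threshold=300.0):
    """Average BODIPY-FA signal over cells whose CD2 signal exceeds the threshold."""
    positive = [bodipy for cd2, bodipy in cells if cd2 > cd2_threshold]
    if not positive:
        raise ValueError("no CD2-positive cells above the threshold")
    return sum(positive) / len(positive)


# Made-up example values, not measurements from the figures.
example_cells = [(50.0, 10.0), (400.0, 200.0), (350.0, 400.0), (300.0, 999.0)]
example_mean = mean_bodipy_of_cd2_positive(example_cells)
```

A cell with exactly 300 CD2 units is excluded here because the text specifies ">300".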

Figure 4 is a graph of the uptake of palmitate with time. The full-length 
coding region of mtFATP (squares) or a control protein (TFE3; circles) was
subcloned into the inducible, prokaryotic expression vector pET (Novagen,
Madison, WI). Expression from the resulting plasmid was induced (solid symbols)
in transformed E. coli cells with 1 mM isopropyl-β-D-thiogalactoside (IPTG) for 1
hour, or cells were left uninduced (open symbols). Data points were done in
triplicate and counts were normalized to the number of bacteria as determined by
OD600.

Figure 5 is a phylogenetic tree produced by aligning complete and partial 
sequences for FATP genes from human, rat, mouse, puffer fish, D. melanogaster, C.
elegans, S. cerevisiae, and M. tuberculosis using ClustalX and using these data to
produce a phylogenetic tree using TreeViewPPC. The bar indicates the number of
substitutions per residue, i.e., 0.1 corresponds to a distance of 10 substitutions per
100 residues. 
Figure 6 shows a comparison of the FATP signature sequences of mmFATP1
(SEQ ID NO:1), mmFATP5 (SEQ ID NO:2), ceFATPa (SEQ ID NO:3), scFATP
(SEQ ID NO:4) and mtFATP (SEQ ID NO:5).

Figure 7 shows the sequence identity among the FATP family members and 
VLACs, based on the 360 amino acid signature sequence of FATP from Figure 1.

Figures 8A and 8B are the mmFATP3 DNA sequence (SEQ ID NO:6). 

Figure 9 is the mmFATP3 protein sequence (SEQ ID NO:7). 

Figures 10A and 10B are the mmFATP4 DNA sequence (SEQ ID NO:8). 

Figure 11 is the mmFATP4 protein sequence (SEQ ID NO:9).

Figures 12A and 12B are the mmFATP5 DNA sequence (SEQ ID NO:10).

Figure 13 is the mmFATP5 protein sequence (SEQ ID NO:11).

Figures 14A and 14B are the hsFATP2 DNA sequence (SEQ ID NO:12).

Figure 15 is the hsFATP2 protein sequence (SEQ ID NO:13).

Figures 16A and 16B are the hsFATP3 DNA sequence (SEQ ID NO:14).

Figure 17 is the hsFATP3 protein sequence (SEQ ID NO:15).

Figures 18A and 18B are the hsFATP4 DNA sequence (SEQ ID NO:16).

Figure 19 is the hsFATP4 protein sequence (SEQ ID NO:17).

Figures 20A and 20B are the hsFATP5 DNA sequence (SEQ ID NO:18).

Figure 21 is the hsFATP5 protein sequence (SEQ ID NO:19).

Figures 22A and 22B are the hsFATP6 DNA sequence (SEQ ID NO:20).

Figure 23 is the hsFATP6 protein sequence (SEQ ID NO:21).

Figures 24A and 24B are the mtFATP DNA sequence (SEQ ID NO:22).

Figure 25 is the mtFATP protein sequence (SEQ ID NO:23).

Figure 26 shows the DNA sequence (SEQ ID NO:24) and predicted amino
acid sequence (SEQ ID NO:25) of human FATP1.

Figure 27 shows the DNA sequence (SEQ ID NO:26) and predicted amino
acid sequence (SEQ ID NO:27) of human FATP4.

Figure 28A is a hydrophobicity plot for hsFATP1, showing that it has
multiple membrane-spanning domains.

Figure 28B is the amino acid composition of hsFATP1.
Figure 28C is a hydrophilicity plot for hsFATP1, made using the Kyte-
Doolittle method, averaging hydrophilicity values for 18 amino acid residues at a
time.
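
The 18-residue averaging mentioned for the Kyte-Doolittle plots can be sketched as a sliding-window mean. This is a minimal sketch, not the procedure actually used to draw the figures: the function name is hypothetical, the table is the standard published Kyte-Doolittle hydropathy scale, and the text does not state whether the figures plot these values directly or with the sign reversed.

```python
# Standard Kyte-Doolittle hydropathy values per amino acid (one-letter code).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def kyte_doolittle_profile(sequence, window=18):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    if len(scores) < window:
        return []  # sequence shorter than one window: nothing to average
    running_total = sum(scores[:window])
    profile = [running_total / window]
    # Slide the window one residue at a time, updating the running sum.
    for index in range(window, len(scores)):
        running_total += scores[index] - scores[index - window]
        profile.append(running_total / window)
    return profile
```

A sequence of length n yields n - 17 averaged values with the default window; a residue outside the 20 standard amino acids raises a KeyError in this sketch.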

Figure 29A is a hydrophobicity plot for hsFATP4, showing that it has
multiple membrane-spanning domains.

Figure 29B is a listing of the amino acid composition of hsFATP4. 
Figure 29C is a hydrophilicity plot for hsFATP4, made using the Kyte- 
Doolittle method, averaging hydrophilicity values for 18 amino acid residues at a 
time. 

Figures 30A and 30B show a comparison of the nucleotide sequence of
human FATP1 (SEQ ID NO:28) and the nucleotide sequence of mouse FATP1
(SEQ ID NO:29).

Figures 31A and 31B show a comparison of the nucleotide sequence of
human FATP4 (SEQ ID NO:30) and the nucleotide sequence of mouse FATP4
(SEQ ID NO:31).

Figure 32 shows a comparison of the amino acid sequence of human FATP1 
(SEQ ID NO:32) and the amino acid sequence of mouse FATP1 (SEQ ID NO:33). 
Shaded amino acid residues match the consensus sequence exactly. 

Figure 33 shows a comparison at the amino acid level of human FATP4 
(SEQ ID NO:34) and mouse FATP4 (SEQ ID NO:35). Shaded amino acid residues
match the consensus sequence exactly. 

Figure 34 shows the nucleotide sequence (SEQ ID NO:36) and predicted 
amino acid sequence (SEQ ID NO:37) of hsFATP6. 

Figure 35A is a hydrophobicity plot for hsFATP6, showing that it has
multiple membrane-spanning domains.

Figure 35B is a listing of the amino acid composition of hsFATP6. 

Figure 35C is a hydrophilicity plot for hsFATP6, made using the Kyte- 
Doolittle method, averaging hydrophilicity values for 18 amino acid residues at a 
time. 
Figure 36 shows an alignment of the amino acid sequences of hsFATP1
(SEQ ID NO:38), hsFATP4 (SEQ ID NO:39) and hsFATP6 (SEQ ID NO:40).
Shaded amino acid residues match the consensus sequence exactly. 

Figure 37 shows results of assessment of fatty acid uptake by human FATP1 
and human FATP4. The percent of CD2-positive cells exhibiting a BODIPY-
fluorescence of more than 300 arbitrary units is plotted for the three different
conditions tested. 

Figure 38 is a graph showing uptake of tritiated oleate, with time, by 293 
cells transfected with either (diamonds) a plasmid for expression of human FATP4 
or (squares) a control plasmid.

Figure 39 is an illustration of the amino acid sequences of human FATP4 
(SEQ ID NO:41) and mouse FATP4 (SEQ ID NO:42) compared to human FATP1 
(SEQ ID NO:43). Shown by underlining are the FATP consensus sequence (236-
556 of hsFATP1) and the AMP-binding motif (246-254 of hsFATP1). The human
FATPs were cloned by screening libraries with sequences from ESTs (expressed
sequence tags). Mouse FATP4 was cloned by PCR using degenerate primers. 

Figure 40 is a graph showing the uptake, with time, of tritiated oleate by 
mouse enterocytes in the presence of no oligonucleotide (squares), sense 
oligonucleotide (circles) or antisense oligonucleotide (diamonds).

Figure 41 is a bar graph showing uptake of tritiated oleate by mouse
enterocytes in the presence of various concentrations of antisense (solid bars),
mismatch (stippled bars) or sense (lined bars) oligonucleotides. 

Figure 42 is a bar graph showing uptake of tritiated oleate and uptake of 35S-
labeled methionine by mouse enterocytes to which were added no oligonucleotide,
the antisense oligonucleotide, or the mismatch oligonucleotide.

Figure 43A is the nucleotide sequence of the gene encoding mouse FATP4 
(SEQ ID NO:44).

Figure 43B is the amino acid sequence of mouse FATP4 protein (SEQ ID 
NO:45). 

Figures 44A, 44B, and 44C are the hsFATP1 DNA sequence (SEQ ID
NO:46). Coding region: 175-2115 (1941 nt).
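
The nucleotide counts given with the coding regions follow from inclusive 1-based coordinates. The sketch below is illustrative only: the function names are hypothetical, and whether each stated region includes a stop codon is not specified in the text, so the codon count is simply the length divided by three.

```python
def coding_region_length(start, end):
    """Length in nucleotides of a region given inclusive, 1-based coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def codon_count(start, end):
    """Number of complete codons in the region; requires a multiple of three."""
    length = coding_region_length(start, end)
    if length % 3 != 0:
        raise ValueError("region length is not a multiple of three")
    return length // 3


# Coordinates quoted in the figure descriptions, e.g. 175-2115 for hsFATP1.
hsfatp1_length = coding_region_length(175, 2115)
hsfatp1_codons = codon_count(175, 2115)
```

The same calculation reproduces the other annotated lengths, for example 223-2085 (1863 nt) and 643-2502 (1860 nt).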
Figure 45 is the hsFATP1 protein sequence (SEQ ID NO:47).

Figures 46A and 46B are the hsFATP2 DNA sequence (SEQ ID NO:48). 
Coding region: 223-2085 (1863 nt). 

Figure 47 is the hsFATP2 protein sequence (SEQ ID NO:49). 
Figure 48 is the partial DNA sequence of hsFATP3 (SEQ ID NO:50).
Coding region: 1-993.

Figure 49 is the partial protein sequence of hsFATP3 (SEQ ID NO:51). 

Figures 50A, 50B, and 50C are the hsFATP4 DNA sequence (SEQ ID 
NO:52). Coding region: 208-2139 (1932 nt).

Figure 51 is the hsFATP4 protein sequence (SEQ ID NO:53).

Figure 52 is the hsFATP5 partial DNA sequence (SEQ ID NO:54). Coding 
region: 1-1062. 

Figure 53 is the hsFATP5 partial protein sequence (SEQ ID NO:55). 
Figures 54A, 54B, and 54C are the hsFATP6 DNA sequence (SEQ ID 
NO:56). Coding region: 643-2502 (1860 nt).

Figure 55 is the hsFATP6 protein sequence (SEQ ID NO:57).

Figures 56A, 56B, and 56C are the rnFATP1 DNA sequence (rn=Rattus
norvegicus; SEQ ID NO:58). Coding region: 75-2015 (1941 nt).

Figure 57 is the rnFATP1 protein sequence (SEQ ID NO:59).

Figures 58A, 58B, and 58C are the rnFATP2 DNA sequence (SEQ ID
NO:60). Coding region: 795-2657 (1863 nt).

Figure 59 is the rnFATP2 protein sequence (SEQ ID NO:61).

Figures 60A and 60B are the rnFATP4 partial DNA sequence (SEQ ID
NO:62). Coding region: 1-1218.
Figure 61 is the rnFATP4 partial protein sequence (SEQ ID NO:63).

Figures 62A, 62B, and 62C are the mmFATP1 DNA sequence (SEQ ID
NO:64). Coding region: 1-1944.

Figure 63 is the mmFATP1 protein sequence (SEQ ID NO:65).

Figures 64A and 64B are the mmFATP2 DNA sequence (SEQ ID NO:66).
Coding region: 121-1992 (1872 nt).

Figure 65 is the mmFATP2 protein sequence (SEQ ID NO:67). 
Figures 66A and 66B are the mmFATP3 partial DNA sequence (SEQ ID
NO:68). Coding region: 1-1830. 

Figure 67 is the mmFATP3 partial protein sequence (SEQ ID NO:69). 

Figures 68A, 68B, and 68C are the mmFATP4 DNA sequence (SEQ ID 
NO:70). Coding region: 1-1932.

Figure 69 is the mmFATP4 protein sequence (SEQ ID NO:71).

Figures 70A and 70B are the mmFATP5 DNA sequence (SEQ ID NO:72). 
Coding region: 60-2129. 

Figure 71 is the mmFATP5 protein sequence (SEQ ID NO:73).

Figures 72A and 72B are the dmFATP partial DNA sequence
(dm=Drosophila melanogaster; SEQ ID NO:74). Coding region: 1-1773.

Figure 73 is the dmFATP partial protein sequence (SEQ ID NO:75). 

Figure 74 is the drFATP partial DNA sequence (dr=Danio rerio, zebrafish;
SEQ ID NO:76). Coding region: 1-173.

Figure 75 is the drFATP partial protein sequence (SEQ ID NO:77).

Figures 76A and 76B are the ceFATPa DNA sequence (SEQ ID NO:78).
Coding region: 1-1953.

Figure 77 is the ceFATPa protein sequence (SEQ ID NO:79). 

Figures 78A and 78B are the ceFATPb DNA sequence (SEQ ID NO:80).
Coding region: 1-1968.

Figure 79 is the ceFATPb protein sequence (SEQ ID NO:81). 

Figures 80A and 80B are the chFATP DNA sequence (SEQ ID NO:82;
ch=Cochliobolus heterostrophus). Coding region: 1-1932.

Figure 81 is the chFATP protein sequence (SEQ ID NO:83). 
Figure 82 is the anFATP partial DNA sequence (an=Aspergillus nidulans;
SEQ ID NO:84). Coding region: 1-597.

Figure 83 is the anFATP partial protein sequence (SEQ ID NO:85). 

Figure 84 is the mgFATP partial DNA sequence (mg= Magnaporthe grisea, 
rice blast; SEQ ID NO:86). Coding region: 1-522. 
Figure 85 is the mgFATP partial protein sequence (SEQ ID NO:87).
Figures 86A and 86B are the scFATP DNA sequence (SEQ ID NO:88).
Coding region: 1-1872. 

Figure 87 is the scFATP protein sequence (SEQ ID NO:89). 
Figures 88A and 88B are the mtFATP DNA sequence (SEQ ID NO:90). 
Figure 89 is the mtFATP protein sequence (SEQ ID NO:91). Coding region:
1-1794.

Figure 90 is a consensus sequence of the FATP signature sequence (SEQ ID
NO:100), based on 23 independent sequences aligned in ClustalX. The height of the
bar at each amino acid residue position indicates the degree of conservation at that
position. Gaps have been inserted to maintain the strength of the alignment.

Figure 91 is a hydrophilicity plot for hsFATP2, made using the Kyte-Doolittle 
method, averaging hydrophilicity values for 18 amino acid residues at a time. 

Figure 92 is a hydrophilicity plot for the hsFATP3 partial protein, made using 
the Kyte-Doolittle method, averaging hydrophilicity values for 18 amino acid 
residues at a time.

Figure 93 is a hydrophilicity plot for the hsFATP5 partial protein, made using
the Kyte-Doolittle method, averaging hydrophilicity values for 18 amino acid 
residues at a time. 

Figures 94A and 94B are a representation of the DNA sequence (SEQ ID 
NO:101) of the hsFATP3 gene, and the amino acid sequence (SEQ ID NO:102) of the
hsFATP3 protein. 

Figure 95 shows that mammalian expression constructs containing either 
hsFATP4 (squares and triangles) or empty control vector (circles) were stably 
transfected into 293 cells. Short-term uptake of Bodipy-FA in the presence of BSA
was determined by FACS. The mean fluorescence of the viable cell population is
expressed in arbitrary fluorescence units. FATP4 protein expression was determined
by densitometry of anti-FATP4 Western blots, and is expressed in arbitrary units. 

Figure 96 is a bar graph illustrating short-term uptake of Bodipy-palmitate (1
µM), by either control cells (black bars) or FATP4-expressing cells (hatched bars),
measured in the presence of 0, 10, or 100 µM unlabeled palmitate. FA uptake was
quantified by FACS and expressed in arbitrary fluorescence units.
Figure 97 shows the rate of [2H]palmitate uptake by 293 cells, which were
stably transfected with a construct for either human FATP4 (diamonds) or an empty 
vector (circles), compared to that of isolated enterocytes (squares). 

Figure 98 is a bar graph illustrating the results when isolated enterocytes were 
incubated for 48h with increasing concentrations of the FATP4 antisense
oligonucleotide or with 100 µM of a randomized control oligonucleotide with
identical nucleotide composition to the FATP4 antisense oligonucleotide. The uptake
of oleate by the enterocytes was then measured over a 5 min time interval (solid bars).
In parallel, the levels of FATP4 protein and, as a loading control, β-catenin, were
determined by Western blotting and quantitated using densitometry (hatched bars).
FA uptake and FATP4 protein levels were normalized to that of untreated cells. The 
averages and standard deviations of 4 independent experiments are shown. 

Figure 99 is a bar graph illustrating the uptake rates of [3H]oleate,
[3H]palmitate and [35S]methionine by primary enterocytes, measured after 48h
incubation with either 100 µM FATP4 antisense (solid bars) or 100 µM randomized
control oligonucleotide (hatched bars) and expressed as % of untreated cells.

Figure 100 is a bar graph illustrating that 8 kb of FATP5 genomic sequence
(SEQ ID NO.: 106) is sufficient for liver specific transcription in vitro. A luciferase
reporter construct containing 8 kb upstream of the FATP5 initiator methionine was
transfected into various cell lines using calcium phosphate as described in Example
17. Forty-eight hours after transfection, luciferase activity was measured and
normalized to β-galactosidase activity. For each cell line, fold induction was
determined by dividing the relative luciferase activity of the 8 kb construct by that of
the promoter-less luciferase reporter vector. The data shown represent the mean of
three experiments done in triplicate. Error bars indicate the SEM.
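
The normalization and fold-induction arithmetic described for the reporter assays can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and all numeric values are hypothetical, and the SEM is taken here as the sample standard deviation divided by the square root of the number of experiments, which the text does not spell out.

```python
from math import sqrt
from statistics import mean, stdev


def relative_luciferase(luciferase, beta_galactosidase):
    """Luciferase activity normalized to beta-galactosidase activity."""
    return luciferase / beta_galactosidase


def fold_induction(construct_relative, vector_relative):
    """Relative activity of a construct divided by that of the promoter-less vector."""
    return construct_relative / vector_relative


def mean_and_sem(values):
    """Mean and standard error of the mean across independent experiments."""
    return mean(values), stdev(values) / sqrt(len(values))


# Made-up readings for illustration; these are not data from the figures.
construct = relative_luciferase(800.0, 100.0)   # 8.0
vector = relative_luciferase(50.0, 100.0)       # 0.5
example_fold = fold_induction(construct, vector)
example_mean, example_sem = mean_and_sem([2.0, 4.0, 6.0])
```

Normalizing to β-galactosidase before taking the ratio corrects for differences in transfection efficiency between cell lines, which is why the fold induction is computed from relative rather than raw luciferase activity.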

Figure 101 is a bar graph illustrating deletion analysis of the FATP5 promoter. 
Constructs containing deletions of the FATP5 promoter were transfected into HepG2 
cells, assayed for luciferase activity, and normalized to β-galactosidase (RLU). The
labels on the vertical axis correspond to the length of the promoter segment as
measured from the initiator methionine. The data shown represent the mean of three
experiments done in triplicate. Error bars indicate the SEM.
Figure 102 is a bar graph illustrating that 271 base pairs upstream of the
FATP5 initiator methionine are sufficient for liver specific luciferase activity. A 
luciferase reporter construct containing 271 base pairs upstream of the FATP5 
initiator methionine was transfected into various cell lines using calcium phosphate as
described in Methods Example 17. Forty-eight hours after transfection, luciferase
activity was measured and normalized to β-galactosidase activity. For each cell line,
fold induction was determined by dividing the relative luciferase activity of the -271
base pair construct by that of the promoter-less luciferase reporter vector. The data
shown represent the mean of three experiments done in triplicate. Error bars indicate
the SEM.

Figures 103A and 103B illustrate mutations of the GC box which abolish
transcriptional activity. A: Schematic of mutations in the GC box aligned with the
normal sequence (SEQ ID NO.: 106, SEQ ID NO.: 107, SEQ ID NO.: 108). The GC
box consensus sequence is underlined. B: Constructs containing 271 base pairs
upstream of the FATP5 initiator methionine with the mutations in the GC box
depicted in part A were transfected into HepG2 cells, assayed for luciferase activity,
and normalized to β-galactosidase (RLU). The data shown represent the mean of
three experiments done in triplicate. Error bars indicate the SEM.

Figure 104 shows a gel shift analysis of the GC box with HepG2 nuclear
extracts. Schematic showing the sequence of the oligonucleotides used in gel shift
studies. The numbering reflects the distance from the initiator methionine. The two
pairs of oligonucleotides are indicated by the lines and labeled AF-1 (SEQ ID NO.:
111, SEQ ID NO.: 112) and AF-2 (SEQ ID NO.: 109, SEQ ID NO.: 110).

Figure 105 is a bar graph illustrating that 30 bp internal deletions of the
FATP5 promoter identify another region required for luciferase activity in HepG2
cells. Reporter constructs were transfected into HepG2 cells. Luciferase activity was
measured and normalized to β-galactosidase activity (RLU). The labels on the
horizontal axis correspond to the nucleotides that were deleted and the numbering on
the vertical axis represents the distance from the initiator methionine. The data
shown represent the mean of three experiments done in triplicate. Error bars indicate
the SEM. Note that the five-fold higher RLU activity in this figure relative to Figures
101 and 103 is the result of a manufacturer change in the β-galactosidase reagent.

Figure 106 is a bar graph illustrating that a linker scan of the FATP5 promoter
identifies two additional elements required for transcription in HepG2 cells. Reporter
constructs were transfected into HepG2 cells. Luciferase activity was measured and
normalized to β-galactosidase activity (RLU). The labels on the horizontal axis
correspond to the constructs in part A. The data shown represent the mean of three
experiments done in triplicate. Error bars indicate the SEM. Please note that the
lower RLU activity in this figure relative to Figures 101 and 103 is also the result of a
manufacturer change in the β-galactosidase reagent.

Figure 107 is a schematic of the FATP5 promoter (SEQ ID NO.: 113). The
GC box and two motifs identified in the linker scan are boxed and labeled. An arrow
indicates the translational initiator of the FATP5 protein. The two halves of the
palindrome contained in the novel motifs and referred to in the discussion are
underlined.

Figure 108 is a photograph showing FATP2 expression in the mouse gall 
bladder epithelium. 

Figure 109 is a photograph showing FATP2 expression in chimpanzee liver. 

Figure 110 is a photograph showing FATP5 expression in chimpanzee liver.

Figures 111A and 111B represent the DNA sequence (SEQ ID NO:116) of
human FATP3.

Figure 112 represents the amino acid sequence (SEQ ID NO:117) of human
FATP3.

Figure 113 is a bar graph showing the results of an experiment comparing
fatty acid transport between cells transfected with SEQ ID NO:116 and untransfected
cells.

Figures 114A, 114B, 114C and 114D represent portions of the amino acid
sequence of mmFATP4 which were produced as fusion polypeptides in E. coli cells.

Figure 115 is a schematic illustrating certain components of the fusion
polypeptides depicted in Figures 114A-D. The schematic shows the lipocalin domain
as well as other identified motifs and notes the relative location of each in the
mmFATP4 fusion polypeptide.

Figure 116 is a bar graph illustrating the results of an experiment comparing
the binding capabilities of the fusion polypeptides shown in Figures 114A-D for an
oleate fatty acid.

Figure 117 is a bar graph showing the results of an experiment comparing
binding of various fatty acids between two of the fusion polypeptides depicted in
Figures 114A-D.

Figures 118A-G illustrate the consensus sequence of hsFATP1, hsFATP2,
hsFATP3, hsFATP4, hsFATP5 and hsFATP6 with the lipocalin domain and
AMP-binding domain of each noted.

The foregoing and other objects, features and advantages of the invention will
be apparent from the following more particular description of preferred embodiments
of the invention, as illustrated in the accompanying drawings in which like reference
characters refer to the same parts throughout the different views. The drawings are
not necessarily to scale, emphasis instead being placed upon illustrating the principles
of the invention.
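
By way of illustration only, the normalization and fold induction calculation
recited for Figures 100 and 102 can be sketched as follows. The readings are
hypothetical placeholders rather than data from the experiments, the function names
are chosen for this sketch, and each experiment is assumed to contribute a single
value that is already the average of its triplicate wells.

```python
from math import sqrt
from statistics import mean, stdev


def relative_luciferase(luciferase, beta_gal):
    """Normalize a luciferase reading to the co-transfected beta-galactosidase reading."""
    return luciferase / beta_gal


def fold_induction(construct_rlu, promoterless_rlu):
    """Divide the construct's relative activity by that of the promoter-less vector."""
    return construct_rlu / promoterless_rlu


def mean_and_sem(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical (luciferase, beta-galactosidase) readings for three experiments.
construct_readings = [(1200.0, 10.0), (1500.0, 12.0), (900.0, 9.0)]
vector_readings = [(50.0, 10.0), (60.0, 12.0), (45.0, 9.0)]

folds = [
    fold_induction(relative_luciferase(*c), relative_luciferase(*v))
    for c, v in zip(construct_readings, vector_readings)
]
average_fold, sem = mean_and_sem(folds)
```

Here `average_fold` corresponds to the height of a bar and `sem` to its error bar;
a real analysis would substitute measured readings for each cell line.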



DETAILED DESCRIPTION OF THE INVENTION 

As described herein, FATPs are a large evolutionarily conserved family of
proteins that mediate the transport of LCFAs into cells. The family includes proteins
which are conserved from mycobacteria to humans and exhibit very different
expression patterns in tissues. Specific embodiments described include FATPs from
mice, humans, nematodes, fungi and mycobacteria which have been shown to be
functional LCFA transporters. The term "fatty acid transport proteins" ("FATPs") as
used herein, refers to the proteins described herein as FATP1, FATP2, FATP3,
FATP4, FATP5 and FATP6, which have been described in one or more species of
mammals, as well as mtFATP, ceFATP, scFATP, anFATP, mgFATP, and chFATP,
and other proteins sharing at least about 50% amino acid sequence similarity,
preferably at least about 60% sequence similarity, more preferably at least about 70%
sequence similarity, and still more preferably, at least about 80% sequence similarity,
and most preferably, at least about 90% sequence similarity in the approximately 360
amino acid signature sequence. The approximately 360 amino acid FATP signature
sequence is shown in Figure 1. The consensus sequence of the signature sequence is
shown in Figure 90. The nomenclature used herein to refer to FATPs includes a
species-specific prefix (e.g., mm, Mus musculus; hs or h, Homo sapiens or human; mt,
M. tuberculosis; dm, D. melanogaster; ce, C. elegans; sc, Saccharomyces cerevisiae)
and a number such that mammalian homologues in different species share the same
number. For example, six human and five mouse FATP genes which are expressed in
a variety of tissues are described herein and are referred to, respectively, as
hsFATP1-hsFATP6 and mmFATP1-mmFATP5; for example, hsFATP4 and mmFATP4 are the
human and mouse orthologs.
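
As a non-limiting illustration of the nomenclature just described, the following
sketch splits a FATP name into its species prefix and family number. The prefix
table contains only the prefixes listed in the preceding paragraph; the function
names and the regular expression are assumptions made for this example and are
not part of the described methods.

```python
import re

# Species prefixes as listed in the nomenclature paragraph above.
SPECIES_PREFIXES = {
    "mm": "Mus musculus",
    "hs": "Homo sapiens",
    "h": "Homo sapiens",
    "mt": "Mycobacterium tuberculosis",
    "dm": "Drosophila melanogaster",
    "ce": "Caenorhabditis elegans",
    "sc": "Saccharomyces cerevisiae",
}

# Optional one- or two-letter lowercase prefix, the literal "FATP", optional number.
_FATP_NAME = re.compile(r"^(?P<prefix>[a-z]{1,2})?FATP(?P<number>\d+)?$")


def parse_fatp_name(name):
    """Return (species, number); either element is None when absent or unknown."""
    match = _FATP_NAME.match(name)
    if match is None:
        raise ValueError(f"not a FATP name: {name!r}")
    prefix = match.group("prefix")
    number = match.group("number")
    species = SPECIES_PREFIXES.get(prefix) if prefix else None
    return species, int(number) if number else None


def share_family_number(name_a, name_b):
    """True when two names carry the same family number, as orthologs do here."""
    number_a = parse_fatp_name(name_a)[1]
    number_b = parse_fatp_name(name_b)[1]
    return number_a is not None and number_a == number_b
```

For instance, `share_family_number("hsFATP4", "mmFATP4")` is true, reflecting the
convention that mammalian homologues in different species share the same number.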

Expression patterns of human and mouse FATPs have been assessed and are
described below. Briefly, results of these assessments show that FATP5 is a
liver-specific gene. FATP2 is highly expressed in liver, kidney and gall bladder
epithelium. Both of these proteins, as well as FATP4 and FATPs from nematodes
and mycobacteria, have been shown to be functional LCFA transporters. Results
have also shown that FATP4 mRNA is present at high levels in epithelial cells of two
regions of the small intestine (the jejunum and ileum) and at lower, but significant,
levels in a third region (the duodenum). They further showed that FATP2 mRNA is
present in epithelial cells of the duodenum at a level similar to that of FATP4 mRNA,
but is present at lower levels in the jejunum and ileum. FATP4 mRNA was
absent from other cell types of the small intestine and no FATP4 mRNA could be
detected in any cells of the colon. No signals above background could be detected for
FATP1, FATP3 and FATP5 in any of the intestinal tissues. Thus, FATP4 is the
major FATP in the mouse small intestine, which supports a major role for FATP4
(along with FATP2 to a lesser extent) in absorption of free fatty acids. hsFATP4 was
clearly expressed in the jejunum and ileum; expression was absent in the stomach.
This, too, is consistent with a major role for FATP4 in absorption of fatty acids in the
human gut. Analysis of FATP expression in human tissues, also described in detail
below, showed that hsFATP6, which has no mouse ortholog as yet, is expressed at
high levels in the heart and at low levels in the placenta, but is undetectable in the
other tissues assessed (Example 9). This is consistent with a major role for FATP6 in
absorption of fatty acids in the heart.

Analysis of FATP3 expression in murine tissues, also described in detail
below, showed that expression occurs at detectable levels in liver, spleen, heart,
kidney, testis, white adipose tissue, exocrine and endocrine pancreatic cells, and also
in lung tissues. FATP3 is expressed at high levels in type-II pneumocytes, a cell type
noted for secretion of surfactant, a phospholipid-rich film critical for lung function
(Example 19).

Long chain fatty acids (LCFAs) are an important energy source for pro- and
eukaryotes and are involved in diverse cellular processes, such as membrane
synthesis, intracellular signaling, protein modification, and transcriptional regulation.
In developed Western countries, human dietary lipids are mainly di- and triglycerides
and account for approximately 40% of caloric intake (Weisburger, J.H. (1997) J. Am.
Diet. Assoc. 97:S16-S23). These lipids are broken down into fatty acids and glycerol
by pancreatic lipases in the small intestine (Chapus, C., Rovery, M., Sarda, L. &
Verger, R. (1988) Biochimie 70:1223-34); LCFAs are then transported into brush
border cells, where the majority is re-esterified and secreted into the lymphatic system
as chylomicrons (Green, P.H. & Riley, J.W. (1981) Aust. N.Z. J. Med. 77:84-90).
Fatty acids are liberated from lipoproteins by the enzyme lipoprotein lipase, which is
bound to the luminal side of endothelial cells (Scow, R.O. & Blachette-Mackie, E.J.
(1992) Mol. Cell. Biochem. 775:181-191). "Free" fatty acids in the circulation are
bound to serum albumin (Spector, A.A. (1984) Clin. Physiol. Biochem. 2:123-134)
and are rapidly incorporated by adipocytes, hepatocytes, and cardiac muscle cells.
The latter derive 60-90% of their energy through the oxidation of LCFAs (Neely, J.F.,
Rovetto, M.J. & Oram, J.F. (1972) Prog. Cardiovasc. Dis. 75:289-329). Although
saturable and specific uptake of LCFAs has been demonstrated for intestinal cells,
hepatocytes, cardiac myocytes, and adipocytes, the molecular mechanisms of LCFA
transport across the plasma membrane have remained controversial (Hui, T.Y. &
Bernlohr, D.A. (1997) Front. Biosci. 75:d222-31-d231; Schaffer, J.E. & Lodish, H.F.
(1995) Trends Cardiovasc. Med. 5:218-224). Described herein is a large family of
highly homologous mammalian LCFA transporters which show wide expression,
including in all tissues relevant to fatty acid metabolism. Further described are novel
members of this family in other species, including mycobacterial and nematode
FATPs which, like their mammalian counterparts, are functional fatty acid
transporters.

The discovery of a diverse but highly homologous family of FATPs is
reminiscent of the glucose transporter family. In a manner similar to the FATPs, the
glucose transporters have very divergent patterns of tissue expression (McGowan,
K.M., Long, S.D. & Pekala, P.H. (1995) Pharmacol. Ther. 65:465-505). The FATPs,
like glucose transporters, may also differ in their substrate specificities, uptake
kinetics, and hormonal regulation (Thorens, B. (1996) Am. J. Physiol. 270:G541-
G553). Indeed, the levels of fatty acids in the blood, like those of glucose, can be
regulated by insulin and are dysregulated in diseases such as noninsulin-dependent
diabetes and obesity (Boden, G. (1997) Diabetes 46:3-10). The underlying
mechanisms for the regulation of free fatty acid concentrations in the blood are not
understood, but could be explained by hormonal modulation of FATPs.

Insulin-resistance is thought to be the major defect in non-insulin-dependent
diabetes mellitus (NIDDM) and is one of the earliest manifestations of NIDDM
(McGarry (1992) Science 258:766-770). Free fatty acids (FFAs) may provide an
explanation for why obesity is a risk factor for NIDDM. Plasma levels of FFAs are
elevated in diabetic patients (Reaven et al. (1988) Diabetes 37:1020). Elevated
plasma FFAs have been demonstrated to induce insulin-resistance in
whole animals and humans (Boden (1998) Front. Biosci. 3:D169-D175). This
insulin-resistance is likely mediated by effects of FFAs on a variety of tissues. FFAs
added to adipocytes in vitro induce insulin resistance in this cell type as evidenced by
inhibition of insulin-induced glucose transport (Van Epps-Fung et al. (1997)
Endocrinology 138:4338-4345). Rats fed a high fat diet developed skeletal muscle
insulin resistance as evidenced by a decrease in insulin-induced glucose uptake by
skeletal muscle (Han et al. (1997) Diabetes 46:1761-1767). In addition, elevated
plasma FFAs increase insulin-suppressed endogenous glucose production in the liver
(Boden (1998) Front. Biosci. 3:D169-D175), thus increasing hepatic glucose output.
It has been postulated that the adverse effects of plasma free fatty acids are due to the
FFAs being taken up into the cell, leading to an increase in intracellular long chain
fatty acyl CoA; intracellular long chain acyl CoAs are thought to mediate the effects
of FFAs inside the cell. Thus, fatty acid induced insulin-resistance may be prevented
by blocking uptake of FFAs into select tissues, in particular liver (by blocking FATP2
and/or FATP5), adipocyte (by blocking FATP1), and skeletal muscle (by blocking
FATP1). Blocking intestinal fat absorption (by blocking FATP4) is also expected to
reduce plasma FFA levels and thus ameliorate insulin resistance.

During the pathogenesis of NIDDM, insulin-resistance can initially be
counteracted by increasing insulin output by the pancreatic beta cell. Ultimately, this
compensation fails, beta cell function decreases and overt diabetes results (McGarry
(1992) Science 258:766-770). Manipulating beta cell function is a second point
where fatty acid transporter blockers may be beneficial for diabetes. While no FATP
homolog has been identified so far that is expressed in the beta cell of the pancreas,
the data described below suggest the existence of such a transporter and the sequence
information included herein provides the means to identify such a transporter by
degenerate PCR, using primers to regions conserved in all FATP family members, or
by low stringency hybridization. It has been demonstrated that exposure of pancreatic
beta-cells to FFAs increases the basal rate of insulin secretion; this in turn leads to a
decrease in the intracellular stores of insulin, resulting in decreased capacity for
insulin secretion after chronic exposure (Bollheimer et al. (1998) J. Clin. Invest.
101:1094-1101). The effects of FFAs are again likely to be mediated by intracellular
long chain fatty acyl CoA molecules (Liu et al. (1998) J. Clin. Invest. 101:1870-
1875). FFAs have also been demonstrated to increase beta cell apoptosis
(Shimabukuro et al. (1998) Proc. Natl. Acad. Sci. USA 95:2498-2502), possibly
contributing to the decrease in beta cell numbers in late stage NIDDM.

Another finding with potentially broad implications is the identification of a
FATP homologue in M. tuberculosis. Tuberculosis causes more deaths worldwide
than any other infectious agent and drug-resistant tuberculosis is re-emerging as a
problem in industrialized nations (Bloom, B.R. & Small, P.M. (1998) N. Engl. J.
Med. 555:677-678). Mycobacterium tuberculosis has about 250 enzymes involved in
fatty acid metabolism, compared with only about 50 in E. coli. It has been suggested
that, living as a pathogen, the mycobacteria are largely lipolytic, rather than lipogenic,
relying on the lipids within mammalian cells and the tubercle (Cole, S.T. et al.,
Nature 595:537-544 (1998)). The de novo synthesis of fatty acids in Mycobacterium
leprae is insufficient to maintain growth (Wheeler, P.R., Bulmer, K. & Ratledge, C.
(1990) J. Gen. Microbiol. 136:211-217). Thus, it is reasonable to expect that
inhibitors of mtFATP will serve as therapeutics for tuberculosis. FATPs expressed in
mycobacteria can be targeted to reduce or prevent replication of mycobacteria (e.g., to
reduce or prevent replication of M. tuberculosis) and, thus, reduce or prevent their
adverse effects. For example, a FATP or FATPs expressed by M. tuberculosis can be
targeted and inhibited, thus reducing or preventing growth of this pathogen (and
tuberculosis in humans and other mammals). An inhibitor of an M. tuberculosis
FATP can be identified, using methods described herein (e.g., expressing the FATP
in an appropriate host cell, such as E. coli or COS cells; contacting the cells with an
agent or drug to be assessed for its ability to inhibit the FATP and, as a result,
mycobacterial growth, and assessing its effects on growth). A drug or agent
identified in this manner can be further tested for its ability to inhibit an M.
tuberculosis FATP and M. tuberculosis infection in an appropriate animal model or
in humans. A method of inhibiting mycobacterial growth, particularly growth of M.
tuberculosis, and compounds useful as drugs for doing so are also the subject of this
invention.

An isolated polynucleotide encoding mtFATP, like other polynucleotides
encoding FATPs of the FATP family, can be incorporated into vectors, nucleic acids
of viruses, and other nucleic acid constructs that can be used in various types of host
cells to produce mtFATP. This mtFATP can be used, as it appears on the surface of
cells, or in various artificial membrane systems, to assess fatty acid transport
function, and to identify ligands and molecules that are modulators of fatty acid
transport activity. Molecules found to be inhibitors of mtFATP function can be
incorporated into pharmaceutical compositions to administer to a human for the
treatment of tuberculosis.

Particular embodiments of the invention are polynucleotides encoding a
FATP of Cochliobolus (Helminthosporium) heterostrophus or portions or variants
thereof, the isolated or recombinantly produced FATP, methods for assessing whether
an agent binds to the chFATP, and further methods for assessing the effect of an
agent being tested for its ability to modulate fatty acid transport activity.
Cochliobolus heterostrophus is an ascomycete that is the cause of southern corn leaf
blight, an economically important threat to the corn crop in the United States. The
related species C. sativus causes crown rot and common root rot in wheat and barley.
One or more FATPs of C. heterostrophus can be targeted for the identification of an
inhibitor of chFATP function, which can then be used as an agent effective against
infection of plants by C. heterostrophus and related organisms. Methods described
herein that were applied in studying the expression of a FATP gene and the function
of the FATP in its natural site of expression or in a host cell can be used in the study
of the chFATP gene and protein.

Magnaporthe grisea (rice blast) is an economically important fungal pathogen
of rice. Further embodiments of the invention are nucleic acid molecules encoding a
FATP of Magnaporthe grisea, portions thereof, or variants thereof, isolated
mgFATP, nucleic acid constructs, and engineered cells expressing mgFATP. Other
aspects of the invention are assays to identify an agent which binds to mgFATP and
assays to identify an agent which modulates the function of mgFATP in cells in
which mgFATP is expressed or in artificial membrane systems. Agents identified as
inhibiting mgFATP activity can be developed into anti-fungal agents to be used to
treat rice infected with rice blast.

Caenorhabditis elegans is a nematode related to plant pathogens and human
parasites. An isolated polynucleotide which encodes ceFATP, like other
polynucleotides encoding FATPs of the FATP family described herein, can be
incorporated into nucleic acid vectors and other constructs that can be used in various
types of cells to produce ceFATP. ceFATP, as it occurs in cells or as it can be isolated
or incorporated into various artificial or reconstructed membrane systems, can be
used to assess fatty acid transport, and to identify ligands and agents that modulate
fatty acid transport activity. Agents found by such assays to be inhibitors of ceFATP
activity can be incorporated into compositions for the treatment of diseases caused by
genetically related organisms with a FATP of similar sensitivity to the agents.

Aspergillus nidulans is one of a family of fungal species that can infect
humans. Further embodiments of the invention of the family of polynucleotides
encoding FATPs are polynucleotides encoding a FATP of Aspergillus nidulans, and
vectors and host cells that can be constructed to comprise such polynucleotides.
Further embodiments are a polypeptide encoded by such polynucleotides, portions
thereof having one or more functions characteristic of a FATP, and various methods.
The methods include those for identifying agents that bind to a FATP and those for
assessing the effect of an agent being tested for its ability to modulate fatty acid
transport activity. Those agents found to inhibit fatty acid transport function can be
used in compositions as anti-fungal pharmaceuticals, or can be modified for greater
effectiveness as a pharmaceutical.

One aspect of the invention relates to isolated nucleic acids that encode a
FATP as described herein, such as those FATPs having an amino acid sequence in
Figure 45 (SEQ ID NO:47), Figure 47 (SEQ ID NO:49), Figure 112 (SEQ ID
NO:117), Figure 51 (SEQ ID NO:53), Figure 53 (SEQ ID NO:55), and Figure 55 (SEQ
ID NO:57) and nucleic acids closely related thereto as described herein.

Using the information provided herein, such as a nucleic acid sequence set
forth in Figures 44A-44C (SEQ ID NO:46), Figures 46A and 46B (SEQ ID NO:48),
Figures 111A and 111B (SEQ ID NO:116), Figures 50A-50C (SEQ ID NO:52), Figure 52
(SEQ ID NO:54), and Figures 54A-54C (SEQ ID NO:56), a nucleic acid of the invention
encoding a FATP polypeptide has been obtained using standard cloning and
screening methods, such as those for cloning and sequencing cDNA library
fragments, followed by obtaining a full length clone. For example, to obtain a nucleic
acid of the invention, a library of clones of cDNA of human or other mammalian
DNA can be probed with a labeled oligonucleotide, such as a radiolabeled
oligonucleotide, preferably about 17 nucleotides or longer, derived from a partial
sequence. Clones carrying DNA identical to that of the probe can then be
distinguished using stringent (also, "high stringency") hybridization conditions. By
sequencing the individual clones thus identified with sequencing primers designed
from the original sequence it is then possible to extend the sequence in both
directions to determine the full length sequence. Suitable techniques are described,
for example, in Current Protocols in Molecular Biology (F.M. Ausubel et al., eds.),
containing supplements through Supplement 42, 1998, John Wiley and Sons, Inc.,
especially chapters 5, 6 and 7.

Embodiments of the invention include isolated nucleic acid molecules
comprising any of the following nucleotide sequences: 1.) a nucleotide sequence
which encodes a protein comprising the amino acid sequence of hsFATP1 (SEQ ID
NO:47), the amino acid sequence of hsFATP2 (SEQ ID NO:49), the amino acid
sequence of hsFATP3 (SEQ ID NO:117), the amino acid sequence of hsFATP4 (SEQ
ID NO:53), the amino acid sequence of hsFATP5 (SEQ ID NO:55) or the amino acid
sequence of hsFATP6 (SEQ ID NO:57); 2.) nucleotide sequences of hsFATP1,
hsFATP2, hsFATP3, hsFATP4, hsFATP5, or hsFATP6 (SEQ ID NO:46, 48, 116, 52,
54, or 56, respectively); 3.) a nucleotide sequence which is complementary to the
nucleotide sequence of hsFATP1 (SEQ ID NO:46), hsFATP2 (SEQ ID NO:48),
hsFATP3 (SEQ ID NO:116), hsFATP4 (SEQ ID NO:52), hsFATP5 (SEQ ID NO:54)
or hsFATP6 (SEQ ID NO:56); 4.) a nucleotide sequence which consists of the
coding region of hsFATP1 (SEQ ID NO:46), the coding region of hsFATP2 (SEQ ID
NO:48), the coding region of hsFATP3 (SEQ ID NO:116), the coding region of
hsFATP4 (SEQ ID NO:52), the coding region of hsFATP5 (SEQ ID NO:54), or the
coding region of hsFATP6 (SEQ ID NO:56).

The invention further relates to nucleic acids (nucleic acid molecules or
polynucleotides) having nucleotide sequences identical over their entire length to
those shown in the figures, for instance Figures 44A-44C (SEQ ID NO:46), Figures
46A and 46B (SEQ ID NO:48), Figures 111A-B (SEQ ID NO:116), Figures 50A-50C
(SEQ ID NO:52), Figure 52 (SEQ ID NO:54), and Figures 54A-54C (SEQ ID
NO:56). It further relates to DNA which, due to the degeneracy of the genetic code,
encodes a FATP encoded by one of the FATP-encoding DNAs, whose amino acid
sequence is provided herein. Also provided by the invention are nucleic acids having
the coding sequences for the mature polypeptides or fragments in reading frame with
other coding sequences, such as those encoding a leader or secretory sequence, a pre-,
or pro- or prepro- protein sequence. The nucleic acids of the invention encompass
nucleic acids that include a single continuous region or discontinuous regions
encoding the polypeptide, together with additional regions, that may also contain
coding or non-coding sequences. The nucleic acids may also contain non-coding
sequences, including, for example, but not limited to, non-coding 5' and 3' sequences,
such as the transcribed, non-translated sequences, termination signals, ribosome
binding sites, sequences that stabilize mRNA, introns, polyadenylation signals, and
additional coding sequences which encode additional amino acids. For example, a
marker sequence that facilitates purification of the fused polypeptide can be encoded.
In certain embodiments of the invention, the marker sequence can be a hexa-histidine
peptide, as provided in the pQE vector (Qiagen, Inc., Venlo, The Netherlands) and
described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), or an HA
tag (Wilson et al., Cell 37:767 (1984)), or a sequence encoding glutathione S-
transferase of Schistosoma japonicum (vectors available from Pharmacia; see Smith,
D.B. and Johnson, K.S., Gene 67:31 (1988) and Kaelin, W.G. et al., Cell 70:351
(1992)). Nucleic acids of the invention also include, but are not limited to, nucleic
acids comprising a structural gene and its naturally associated sequences that control
gene expression.
gene expression. 

The invention further relates to nucleic acids (nucleic acid molecules or
polynucleotides) that encode a FATP polypeptide. In a particular embodiment, a
nucleic acid encodes a portion of a FATP which includes a motif or domain, for
example, a lipocalin domain or an AMP-binding domain. Such a polypeptide portion
can be a functional portion of a FATP protein. The term "lipocalin domain" is an art
recognized term and as used herein refers to a particular domain present in FATP
proteins. This domain is described as including regions of sequence homology as
well as a common tertiary structure represented as an eight stranded antiparallel
beta-barrel (see Banaszak, L. et al., Advances in Protein Chemistry, 45:89-151). Many
lipocalin domains can be identified structurally as a sequence contained within the
general formula:
[DENG]-X-[DENQGSTARK]-X(0,2)-[DENQARK]-[LIVFY]-{CP}-G-{C}-W-[FYWLRH-X]-[LIVMTA],
e.g., the lipocalin signature sequence or
consensus pattern (SEQ ID NO: 125). One skilled in the art will recognize that a
lipocalin domain for a particular FATP protein can vary in sequence from this general
formula. A FATP lipocalin domain can be, for example, identical to the lipocalin
signature sequence or can exhibit 60, 65, 70, 75, 80, 85, 90, 95 or greater percent
sequence identity compared to the general formula provided that it retains lipocalin
binding function. For example, a lipocalin domain for each of the human FATPs,
hsFATP1 (SEQ ID NO: 126), hsFATP2 (SEQ ID NO: 127), hsFATP3 (SEQ ID NO:
128), hsFATP4 (SEQ ID NO: 129), hsFATP5 (SEQ ID NO: 130), and hsFATP6
(SEQ ID NO: 131) has been identified. The pattern of these lipocalin domains is
highly conserved across the FATP family.
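
As an illustration only, a general formula of this kind can be translated into a
regular expression and used to scan an amino acid sequence. The sketch below makes
several assumptions that should be checked against SEQ ID NO: 125: that the formula
follows PROSITE-style notation (X is any residue, square brackets list allowed
residues, braces list excluded residues, and X(0,2) is zero to two residues), and
that the element printed as [FYWLRH-X] is to be read as [FYWLRH] followed by X.
The function names and the test peptide are hypothetical.

```python
import re

# General formula from the text, with [FYWLRH-X] read as [FYWLRH]-X (assumption).
LIPOCALIN_PATTERN = (
    "[DENG]-X-[DENQGSTARK]-X(0,2)-[DENQARK]-[LIVFY]-{CP}-G-{C}-W-[FYWLRH]-X-[LIVMTA]"
)

# One pattern element: a core plus an optional (n) or (n,m) repeat count.
_ELEMENT = re.compile(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern string into a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"cannot parse pattern element: {element!r}")
        core, low, high = match.groups()
        if core.upper() == "X":
            regex = "."                       # any residue
        elif core.startswith("[") and core.endswith("]"):
            regex = core                      # allowed residues
        elif core.startswith("{") and core.endswith("}"):
            regex = "[^" + core[1:-1] + "]"   # excluded residues
        else:
            regex = re.escape(core)           # a single fixed residue
        if low is not None:
            regex += "{" + low + ("," + high if high is not None else "") + "}"
        parts.append(regex)
    return "".join(parts)


LIPOCALIN_REGEX = re.compile(prosite_to_regex(LIPOCALIN_PATTERN))


def find_lipocalin_signature(sequence):
    """Return (start index, matched text) of the first match, or None."""
    match = LIPOCALIN_REGEX.search(sequence.upper())
    return (match.start(), match.group()) if match else None
```

A strict pattern match of this kind only detects sequences that fit the formula
exactly; domains that vary from the formula, as contemplated above, would need a
similarity-based comparison instead.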

A nucleic acid encoding a portion of a FATP polypeptide can encode one or
more domains, and also can include additional nucleotides. For example, the nucleic
acid can also include nucleotide sequences that encode a portion of a FATP protein
that is upstream from a lipocalin domain. As the term "upstream" or "upstream
sequences" is used herein in relation to the lipocalin domain, it is intended to refer to
the nucleotide sequence which encodes all or a portion of a FATP protein located
between the signal peptide (when one is present) and the lipocalin domain. In the
absence of a signal peptide, the term refers to the nucleotide sequence which encodes
all or a portion of a FATP protein between the lipocalin domain and the amino
terminus (see Figure 115).

The invention further relates to variants, including naturally-occurring allelic
variants, of those nucleic acids described specifically herein by DNA sequence, that
encode variants of such polypeptides as those having the amino acid sequences
shown in Figure 45 (SEQ ID NO:47), Figure 47 (SEQ ID NO:49), Figure 112 (SEQ
ID NO:117), Figure 51 (SEQ ID NO:53), Figure 53 (SEQ ID NO:55), or Figure 55
(SEQ ID NO:57). Such variants include nucleic acids encoding variants of the
above-listed amino acid sequences, wherein those variants have several, such as 5 to
10, 1 to 5, or 3, 2 or 1 amino acids substituted, deleted, or added, in any combination.
Variants include polynucleotides encoding polypeptides with at least 95% but less
than 100% amino acid sequence identity to the polypeptides described herein by
amino acid sequence. Variant polynucleotides hybridize, under low to high
stringency conditions, to the alleles described herein by DNA sequence. In one
embodiment, variants have silent substitutions, additions and deletions that do not
alter the properties and activities of the FATP. Allelic variants of the polynucleotides
encoding hsFATP1 (Figure 45; SEQ ID NO:47), hsFATP2 (Figure 47; SEQ ID
NO:49), hsFATP3 (Figure 112; SEQ ID NO:117), hsFATP4 (Figure 51; SEQ ID
NO:53), hsFATP5 (Figure 53; SEQ ID NO:55) and hsFATP6 (Figure 55; SEQ ID
NO:57) will be identified as mapping to chromosomal locations listed for the
corresponding wild type genes in Table 2 in Example 1.

Orthologous genes are gene loci in different species that are sufficiently 
similar to each other in their nucleotide sequences to suggest that they originated 
from a common ancestral gene. Orthologous genes arise when a lineage splits into
two species, rather than when a gene is duplicated within a genome. Proteins that are 
orthologs are encoded by genes of two different species, wherein the genes are said to 
be orthologous. 

The invention further relates to polynucleotides encoding polypeptides which
are orthologous to those polypeptides having a specific amino acid sequence
described herein, such as the amino acid sequences shown in Figure 45 (SEQ ID
NO:47), Figure 47 (SEQ ID NO:49), Figure 112 (SEQ ID NO:117), Figure 51 (SEQ
ID NO:53), Figure 53 (SEQ ID NO:55), or Figure 55 (SEQ ID NO:57). These
polynucleotides, which can be called ortholog polynucleotides, encode orthologous
polypeptides that can range in amino acid sequence identity to a reference amino acid
sequence described herein, from about 65% to less than 100%, but preferably 70% to
80%, more preferably 80% to 90%, and still more preferably 90% to less than 100%.
Orthologous polypeptides can also be those polypeptides that range in amino acid
sequence similarity to a reference amino acid sequence described herein from about
75% to 100%, within the signature sequence. The amino acid sequence similarity
between the signature sequences of orthologous polypeptides is preferably 80%, more
preferably 90%, and still more preferably, 95%. The ortholog polynucleotides encode
polypeptides that have similar functional characteristics (e.g., fatty acid transport
activity) and similar tissue distribution, as appropriate to the organism from which the
ortholog polynucleotides can be isolated.

Ortholog polynucleotides can be isolated from (e.g., by cloning or nucleic acid
amplification methods) a great number of species, as shown by the sample of FATPs
from evolutionarily divergent species described herein (see, e.g., Figures 44A-C
through Figure 89). Ortholog polynucleotides corresponding to those in Figure 45
(SEQ ID NO:47), Figure 47 (SEQ ID NO:49), Figures 111A-B (SEQ ID NO:116),
Figure 51 (SEQ ID NO:53), Figure 52 (SEQ ID NO:55) and Figure 55 (SEQ ID
NO:57) are those which can be isolated from mammals such as rat, dog, chimpanzee,
monkey, baboon, pig, rabbit and guinea pig, for example.

Further variants that are fragments of the nucleic acids of the invention may
be used to synthesize full-length nucleic acids of the invention, such as by use as
primers in a polymerase chain reaction. As used herein, the term primer refers to a
single-stranded oligonucleotide which acts as a point of initiation of template-directed
DNA synthesis under appropriate conditions (e.g., in the presence of four
different nucleoside triphosphates and an agent for polymerization, such as DNA or
RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable
temperature. The appropriate length of a primer depends on the intended use of the
primer, but typically ranges from 15 to 30 nucleotides. Short primer molecules
generally require cooler temperatures to form sufficiently stable hybrid complexes
with the template. A primer need not reflect the exact sequence of the template, but
must be sufficiently complementary to hybridize with a template. The term primer
site refers to the area of the target DNA to which a primer hybridizes. The term
primer pair refers to a set of primers including a 5' (upstream) primer that hybridizes
with the 5' end of the DNA sequence to be amplified and a 3' (downstream) primer
that hybridizes with the complement of the 3' end of the sequence to be amplified.
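
By way of non-limiting illustration, the primer pair definition above can be sketched in Python. The function names, the default primer length of 20 and the simple end-of-target selection rule are assumptions made for this sketch only; they are not part of the methods described herein, and a practical design would also consider melting temperature and secondary structure.

```python
# Illustrative sketch only: names and selection rule are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_pair(target: str, length: int = 20) -> tuple:
    """Pick a 5' (upstream) and a 3' (downstream) primer for a target.

    Both primers are returned written 5'->3'. The upstream primer copies
    the 5' end of the target; the downstream primer is the reverse
    complement of the 3' end of the target.
    """
    if not 15 <= length <= 30:
        raise ValueError("primer length should be 15 to 30 nucleotides")
    if len(target) < 2 * length:
        raise ValueError("target too short for two non-overlapping primers")
    upstream = target[:length].upper()
    downstream = reverse_complement(target[-length:])
    return upstream, downstream


if __name__ == "__main__":
    demo = "ATGGCCAAGCTTGGA" + "CCCCCCCCCC" + "GGATCCTTTAAATGA"
    print(primer_pair(demo, length=15))
```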

Further embodiments of the invention are nucleic acids that are at least 80%
identical over their entire length to a nucleic acid described herein, for example a
nucleic acid having the nucleotide sequence in Figures 44A-44C (SEQ ID NO:46),
Figures 46A-46B (SEQ ID NO:48), Figures 111A-B (SEQ ID NO:116), Figures
50A-50C (SEQ ID NO:52), Figure 52 (SEQ ID NO:54), and Figures 54A-54C (SEQ ID
NO:56). Additional embodiments are nucleic acids, and the complements of such
nucleic acids, having at least 90% nucleotide sequence identity to the above-described
sequences, and nucleic acids having at least 95% nucleotide sequence
identity. In preferred embodiments, DNA of the present invention has 97%
nucleotide sequence identity, 98% nucleotide sequence identity, or at least 99%
nucleotide sequence identity with the DNA whose sequences are presented herein.

Other embodiments of the invention are nucleic acids that are at least 80%
identical in nucleotide sequence to a nucleic acid encoding a polypeptide having an
amino acid sequence as set forth in Figure 45 (SEQ ID NO:47), Figure 47 (SEQ ID
NO:49), Figure 112 (SEQ ID NO:117), Figure 51 (SEQ ID NO:53), Figure 53 (SEQ
ID NO:55) or Figure 55 (SEQ ID NO:57), or as such amino acid sequences are set
forth elsewhere herein, and nucleic acids that are complementary to such nucleic
acids. Specific embodiments are nucleic acids having at least 90% nucleotide
sequence identity to a nucleic acid encoding a polypeptide having an amino acid
sequence as described in the list above, nucleic acids having at least 95% sequence
identity, and nucleic acids having at least 97% sequence identity.

The terms "complementary" or "complementarity" as used herein, refer to the
natural binding of polynucleotides under permissive salt and temperature conditions
by base-pairing. Complementarity between two single-stranded molecules may be
"partial" in which only some of the nucleic acid bases bind, or it may be complete when
total complementarity exists between the single-stranded molecules (that is, when
A-T and G-C base pairing is 100% complete). The degree of complementarity between
nucleic acid strands has significant effects on the efficiency and strength of
hybridization between nucleic acid strands. This is of particular importance in
amplification reactions, which depend on binding between nucleic acid strands.
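
As a non-limiting illustration of partial versus complete complementarity, the following Python sketch scores the fraction of positions at which two equal-length, antiparallel single strands form A-T or G-C pairs. The function name and the equal-length restriction are assumptions made for this sketch only.

```python
# Illustrative sketch only: assumes equal-length strands, both written 5'->3'.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def complementarity(strand: str, other: str) -> float:
    """Return the fraction of positions forming A-T or G-C base pairs.

    `other` is reversed before comparison because the two strands of a
    duplex pair in antiparallel orientation. A result of 1.0 corresponds
    to complete complementarity; anything lower is partial.
    """
    if not strand or len(strand) != len(other):
        raise ValueError("strands must be non-empty and of equal length")
    antiparallel = other.upper()[::-1]
    paired = sum((a, b) in WATSON_CRICK
                 for a, b in zip(strand.upper(), antiparallel))
    return paired / len(strand)


if __name__ == "__main__":
    print(complementarity("ATGC", "GCAT"))  # complete
    print(complementarity("ATGC", "GCAA"))  # partial
```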

The invention further includes nucleic acids that hybridize to the above-described
nucleic acids, especially those nucleic acids that hybridize under stringent
hybridization conditions. "Stringent hybridization conditions" or "high stringency
conditions" generally occur within a range from about Tm minus 5°C (5°C below the
strand dissociation temperature or melting temperature (Tm) of the probe nucleic acid
molecule) to about 20°C to 25°C below Tm. As will be understood by those of skill
in the art, the stringency of hybridization may be altered in order to identify or detect
molecules having identical or related polynucleotide sequences. An example of high
stringency hybridization follows. Hybridization solution is (6x SSC/10 mM
EDTA/0.5% SDS/5x Denhardt's solution/100 µg/ml sheared and denatured salmon
sperm DNA). Hybridization is at 64-65°C for 16 hours. The hybridized blot is
washed two times with 2x SSC/0.5% SDS solution at room temperature for 15
minutes each, and two times with 0.2x SSC/0.5% SDS at 65°C for one hour each.
Further examples of high stringency conditions can be found on pages 2.10.1-2.10.16
(see particularly 2.10.8-11) and pages 6.3.1-6 in Current Protocols in Molecular
Biology (Ausubel, F.M. et al., eds., containing supplements up through Supplement
42, 1998). Examples of high, medium, and low stringency conditions can be found
on pages 36 and 37 of WO 98/40404, which are incorporated herein by reference.
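
The temperature window described above (about 5°C to about 25°C below Tm) can be illustrated with a short Python sketch. The Tm estimate used here, 2°C per A or T plus 4°C per G or C, is a common rule of thumb for short oligonucleotides and is an assumption of this sketch; it is not a formula given in this description, and the function names are hypothetical.

```python
# Illustrative sketch only: the Tm estimate is a rule of thumb for short
# oligonucleotides and is an assumption, not part of the description.
def rule_of_thumb_tm(oligo: str) -> float:
    """Estimate Tm in deg C as 2*(A+T) + 4*(G+C)."""
    s = oligo.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def high_stringency_window(tm: float) -> tuple:
    """Return (lowest, highest) temperature: Tm - 25 deg C to Tm - 5 deg C."""
    return (tm - 25.0, tm - 5.0)


def is_high_stringency(temperature: float, tm: float) -> bool:
    """True when the hybridization temperature lies inside the window."""
    low, high = high_stringency_window(tm)
    return low <= temperature <= high


if __name__ == "__main__":
    tm = rule_of_thumb_tm("ATGCATGCATGCATGCATGC")
    print(tm, high_stringency_window(tm))
```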

The invention further relates to nucleic acids obtainable by screening an
appropriate library with a probe having a nucleotide sequence such as that set forth in
Figures 44A-44C (SEQ ID NO:46), Figures 46A-46B (SEQ ID NO:48), Figure 111
(SEQ ID NO:116), Figures 50A-50C (SEQ ID NO:52), Figure 52 (SEQ ID NO:54) or
Figures 54A-54C (SEQ ID NO:56), or a probe which is a sufficiently long fragment
of any of the above; and isolating the nucleic acid. Such probes generally can
comprise at least 15 nucleotides. Nucleic acids obtainable by such screenings may
include RNAs, cDNAs and genomic DNA, for example, encoding FATPs of the
FATP family described herein.

Further uses for the nucleic acid molecules of the invention, whether encoding
a full-length FATP or whether comprising a contiguous portion of a nucleic acid
molecule such as one given in SEQ ID NO:46, 48, 116, 52, 54, or 56, include use as
markers for tissues in which the corresponding protein is preferentially expressed (to
identify constitutively expressed proteins or proteins produced at a particular stage of
tissue differentiation or stage of development of a disease state); as molecular weight
markers on Southern gels; as chromosome markers or tags (when labeled, for example
with biotin, a radioactive label or a fluorescent label) to identify chromosomes or to
map related gene positions; to compare with endogenous DNA sequences in a
mammal to identify potential genetic disorders; as probes to hybridize to, and thus
identify, related DNA sequences; as a source of information to derive PCR primers
for genetic fingerprinting; as a probe to "subtract-out" known sequences in the
process of discovering other novel nucleic acid molecules; for selecting and making
oligomers for attachment to a "gene chip" or other support, to be used, for example,
for examination of expression patterns; to raise anti-protein antibodies using DNA
immunization techniques; and as an antigen to raise anti-DNA antibodies or to elicit
another immune response.

In certain embodiments, a contiguous portion can be about 15, 25, 30, 40, 50,
75, 100, 200, 300, 400, 500, 750, 1000, 1100, 1250, 1500 or more nucleotides in
length. In a particular embodiment, the contiguous portion encompasses the 
signature sequence of a FATP and is about 1080 nucleotides in length. 

Further methods to obtain nucleic acids encoding FATPs of the FATP family
include PCR and variations thereof (e.g., "RACE" PCR and semi-specific PCR
methods). Portions of the nucleic acids having a nucleotide sequence set forth in
Figures 44A-44C (SEQ ID NO:46), Figures 46A-46B (SEQ ID NO:48), Figures
111A-B (SEQ ID NO:116), Figures 50A-50C (SEQ ID NO:52), Figure 52 (SEQ ID
NO:54) or Figures 54A-54C (SEQ ID NO:56) (especially "flanking sequences" on
either side of a coding region) can be used as primers in methods using the
polymerase chain reaction, to produce DNA from an appropriate template nucleic
acid.

Once a fragment of the FATP gene is generated by PCR, it can be sequenced,
and the sequence of the product can be compared to other DNA sequences, for
example, by using the BLAST Network Service at the National Center for
Biotechnology Information. The boundaries of the open reading frame can then be
identified using semi-specific PCR or other suitable methods such as library
screening. Once the 5' initiator methionine codon and the 3' stop codon have been
identified, a PCR product encoding the full-length gene can be generated using
genomic DNA as a template, with primers complementary to the extreme 5' and 3'
ends of the gene or to their flanking sequences. The full-length genes can then be
cloned into expression vectors for the production of functional proteins.
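
Once sequence spanning the coding region is in hand, the initiator methionine codon and the in-frame stop codon can also be located computationally. The following Python sketch is a minimal, non-limiting illustration; it scans only the given strand, assumes the standard stop codons TAA, TAG and TGA, and uses a hypothetical function name.

```python
# Illustrative sketch only: scans the forward strand for the first ATG that
# is followed by an in-frame stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf(seq: str):
    """Return (start, end) of the first open reading frame, or None.

    `start` is the index of the A of the initiator ATG and `end` is one
    past the last base of the stop codon, so seq[start:end] is the ORF
    including its stop codon.
    """
    s = seq.upper()
    start = s.find("ATG")
    while start != -1:
        # Walk codon by codon, in frame with this ATG.
        for i in range(start + 3, len(s) - 2, 3):
            if s[i:i + 3] in STOP_CODONS:
                return start, i + 3
        start = s.find("ATG", start + 1)
    return None


if __name__ == "__main__":
    demo = "CCATGGCTTAAGG"
    bounds = find_orf(demo)
    print(bounds, demo[bounds[0]:bounds[1]])
```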

The invention also relates to isolated proteins or polypeptides such as those
encoded by nucleic acids of the present invention. Isolated proteins can be purified
from a natural source or can be made recombinantly. Proteins or polypeptides
referred to herein as "isolated" are proteins or polypeptides that exist in a state
different from the state in which they exist in cells in which they are normally
expressed in an organism, and include proteins or polypeptides obtained by methods
described herein, similar methods or other suitable methods, and also include
essentially pure proteins or polypeptides, proteins or polypeptides produced by
chemical synthesis or by combinations of biological and chemical methods, and
recombinant proteins or polypeptides which are isolated. Thus, the term "isolated" as
used herein, indicates that the polypeptide in question exists in a physical milieu
distinct from that in which it occurs in nature. Thus, "isolated" includes existing in
membrane fragments and vesicles, membrane fractions, liposomes, lipid bilayers and
other artificial membrane systems. An isolated FATP may be substantially isolated
with respect to the complex cellular milieu in which it naturally occurs, and may even
be purified essentially to homogeneity, for example as determined by PAGE or
column chromatography (for example, HPLC), but may also have further cofactors or
molecular stabilizers, such as detergents, added to the purified protein to enhance
activity. In one embodiment, proteins or polypeptides are isolated to a state at least
about 75% pure; more preferably at least about 85% pure, and still more preferably at
least about 95% pure, as determined by Coomassie blue staining of proteins on
SDS-polyacrylamide gels. Proteins or polypeptides referred to herein as "recombinant" are
proteins or polypeptides produced by the expression of recombinant nucleic acids.

In a preferred embodiment, an isolated polypeptide comprising a FATP, a
functional portion thereof, or a functional equivalent of the FATP, has at least one
function characteristic of a FATP, for example, transport activity, binding function
(e.g., a domain which binds to AMP), or antigenic function (e.g., binding of
antibodies that also bind to a naturally-occurring FATP, as that function is found in
an antigenic determinant). Functional equivalents can have activities that are
quantitatively similar to, greater than, or less than, the reference protein. These
proteins include, for example, naturally occurring FATPs that can be purified from
tissues in which they are produced (including polymorphic or allelic variants),
variants (e.g., mutants) of those proteins and/or portions thereof. Such variants
include mutants differing by the addition, deletion or substitution of one or more
amino acid residues, or modified polypeptides in which one or more residues are
modified, and mutants comprising one or more modified residues. Portions or
fragments of a FATP can range in size from four amino acid residues to the entire
amino acid sequence minus one amino acid and include contiguous portions or
fragments about 4, 5, 6, 7, 8, 9, 10, 15, 25, 30, 40, 50, 75, 100, 150, 200, 300, 400,
500, 600 or more amino acid residues in length. In one particular embodiment, the
portion or fragment includes the signature sequence of a FATP polypeptide and is
about 360 amino acid residues in length.

The isolated proteins of the invention preferably include mammalian fatty
acid transport proteins of the FATP family of homologous proteins. In one
embodiment, the extent of amino acid sequence similarity between a polypeptide
having one of the amino acid sequences shown in Figure 45 (SEQ ID NO:47), Figure
47 (SEQ ID NO:49), Figure 112 (SEQ ID NO:117), Figure 51 (SEQ ID NO:53),
Figure 53 (SEQ ID NO:55), or Figure 55 (SEQ ID NO:57), and the respective
functional equivalents of these polypeptides is at least about 88%. In other
embodiments, the degree of amino acid sequence similarity between a FATP and its
respective functional equivalent is at least about 91%, at least about 94%, or at least
about 97%.

The polypeptides of the invention also include those FATPs encoded by
polynucleotides which are orthologous to those polynucleotides, the sequences of
which are described herein in whole or in part. FATPs which are orthologs to those
described herein by amino acid sequence, in whole or in part, are, for example, fatty
acid transport proteins 1-6 of dog, rat, chimpanzee, monkey, rabbit, guinea pig,
baboon and pig, and are also embodiments of the invention.

To determine the percent identity or similarity of two amino acid sequences or
of two nucleic acid sequences, the sequences are aligned for optimal comparison
purposes (e.g., gaps can be introduced in one or both of a first and a second amino
acid or nucleic acid sequence for optimal alignment, and non-homologous
(dissimilar) sequences can be disregarded for comparison purposes). In a preferred
embodiment, the length of a reference sequence aligned for comparison purposes is at
least 30%, preferably at least 40%, more preferably at least 50%, even more
preferably at least 60%, and even more preferably at least 70%, 80%, or 90% of the
length of the reference sequence. The amino acid residues or nucleotides at
corresponding amino acid positions or nucleotide positions are then compared. When
a position in the first sequence is occupied by the same amino acid residue or
nucleotide as the corresponding position in the second sequence, then the molecules
are identical at that position (as used herein, amino acid or nucleic acid "identity" is
equivalent to amino acid or nucleic acid "similarity"). The percent identity between
the two sequences is a function of the number of identical positions shared by the
sequences, taking into account the number of gaps, and the length of each gap, which
need to be introduced for optimal alignment of the two sequences.
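
As a non-limiting illustration, percent identity can be computed from two already-aligned sequences as follows. The sketch assumes one particular convention, namely identical positions divided by the total number of alignment columns, so that every gap position lowers the result; other conventions exist, and the function name and the "-" gap symbol are assumptions of this sketch.

```python
# Illustrative sketch only: one of several possible percent-identity
# conventions (identical columns / all alignment columns).
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity for two aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # a column that is a gap in both rows carries no data
        columns += 1
        if a == b:
            identical += 1
    if columns == 0:
        raise ValueError("alignment contains no comparable positions")
    return 100.0 * identical / columns


if __name__ == "__main__":
    print(percent_identity("AC-GT", "ACTGA"))
```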

The invention also encompasses polypeptides having a lower degree of
identity but having sufficient similarity so as to perform one or more of the same
functions performed by the polypeptides described herein by amino acid sequence.
Similarity for a polypeptide is determined by conserved amino acid substitution.
Such substitutions are those that substitute a given amino acid in a polypeptide by
another amino acid of like characteristics. Conservative substitutions are likely to be
phenotypically silent. Typically seen as conservative substitutions are the
replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and
Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues
Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the
basic residues Lys and Arg; and replacements among the aromatic residues Phe and
Tyr. Guidance concerning which amino acid changes are likely to be phenotypically
silent is found in Bowie et al., Science 247:1306-1310 (1990).

TABLE 1. Conservative Amino Acid Substitutions

Aromatic        Phenylalanine
                Tryptophan
                Tyrosine

Hydrophobic     Leucine
                Isoleucine
                Valine

Polar           Glutamine
                Asparagine

Basic           Arginine
                Lysine
                Histidine

Acidic          Aspartic Acid
                Glutamic Acid

Small           Alanine
                Serine
                Threonine
                Methionine
                Glycine
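
By way of non-limiting illustration, the groupings of Table 1 can be expressed as a lookup for deciding whether a given substitution is conservative. The one-letter residue codes and the function name are assumptions of this sketch; residues not listed in Table 1 (for example cysteine and proline) are simply treated as having no conservative partner.

```python
# Illustrative sketch only: Table 1 groupings keyed by one-letter codes.
TABLE_1 = {
    "Aromatic": {"F", "W", "Y"},
    "Hydrophobic": {"L", "I", "V"},
    "Polar": {"Q", "N"},
    "Basic": {"R", "K", "H"},
    "Acidic": {"D", "E"},
    "Small": {"A", "S", "T", "M", "G"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True when two different residues fall in the same Table 1 group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution took place
    return any(a in group and b in group for group in TABLE_1.values())


if __name__ == "__main__":
    print(is_conservative("L", "I"))
    print(is_conservative("D", "K"))
```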





The comparison of sequences and determination of percent identity and
similarity between two sequences can be accomplished using a mathematical
algorithm. (Computational Molecular Biology, Lesk, A.M., ed., Oxford University
Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith,
D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data,
Part 1, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994;
Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and
Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M. Stockton
Press, New York, 1991). In a preferred embodiment, the percent identity between
two amino acid sequences is determined using the Needleman and Wunsch (J. Mol.
Biol. (48):444-453 (1970)) algorithm which has been incorporated into the GAP
program in the GCG software package (available at http://www.gcg.com), using
either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10,
8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred
embodiment, the percent identity between two nucleotide sequences is determined
using the GAP program in the GCG software package (Devereux, J., et al., Nucleic
Acids Res. 12(1):387 (1984)) (available at http://www.gcg.com), using a
NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length
weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the percent identity between two
amino acid or nucleotide sequences is determined using the algorithm of E. Meyers
and W. Miller (CABIOS, 4:11-17 (1989)) which has been incorporated into the
ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length
penalty of 12 and a gap penalty of 4.
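
To illustrate the kind of global alignment performed by the Needleman and Wunsch algorithm, a minimal Python sketch follows. It is not the GAP program: for brevity it assumes a simple match/mismatch score in place of a BLOSUM or PAM matrix and a single linear gap penalty in place of separate gap and length weights, and all names and default values are assumptions of this sketch.

```python
# Illustrative sketch only: simplified global alignment with match/mismatch
# scoring and a linear gap penalty (assumed values, not GAP defaults).
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2) -> tuple:
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming matrix.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

The aligned strings returned by such a routine are the input from which a percent identity figure is then calculated.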

The nucleic acid and protein sequences of the present invention can further be
used as a "query sequence" to perform a search against databases to, for example,
identify other family members or related sequences. Such searches can be performed
using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (J. Mol.
Biol. 215:403-10 (1990)). BLAST nucleotide searches can be performed with the
NBLAST program, score = 100, word length = 12 to obtain nucleotide sequences
homologous to (with calculably significant similarity to) the nucleic acid molecules
of the invention. BLAST protein searches can be performed with the XBLAST
program, score = 50, word length = 3 to obtain amino acid sequences homologous to
the proteins of the invention. To obtain gapped alignments for comparison purposes,
Gapped BLAST can be utilized as described in Altschul et al. (Nucleic Acids Res.
25(17):3389-3402 (1997)). When utilizing BLAST and gapped BLAST programs,
the default parameters of the respective programs (e.g., XBLAST and NBLAST) can
be used. See http://www.ncbi.nlm.nih.gov. Similarity for nucleotide and
amino acid sequences can be defined in terms of the parameters set by the Advanced
Blast search available from NCBI (the National Center for Biotechnology
Information; see, for Advanced BLAST page,
www.ncbi.nlm.nih.gov/cgi-bin/BLAST/nph-newblast?Jform=1). These default parameters, recommended for a
query molecule of length greater than 85 amino acid residues or nucleotides, have
been set as follows: gap existence cost, 11; per residue gap cost, 1; lambda ratio,
0.85. Further explanation of version 2.0 of BLAST can be found on related website
pages and in Altschul, S.F. et al., Nucleic Acids Res. 25:3389-3402 (1997).

In certain embodiments, a contiguous portion can be about 4, 5, 6, 7, 8, 9, 10, 
15, 25, 30, 40, 50, 75, 100, 150, 200, 300, 400, 500, 600 or more amino acid residues 
in length. In one particular embodiment, the portion or fragment includes the
signature sequence of a FATP polypeptide and is about 360 amino acid residues in
length. 

The invention further relates to fusion proteins, comprising a FATP or
functional portion thereof (as described above) as a first moiety, linked to a second
moiety not occurring in the FATP as found in nature. Thus, the second moiety can be
an amino acid, peptide or polypeptide. The first moiety can be in an N-terminal
location, C-terminal location or internal to the fusion protein. In one embodiment,
the fusion protein comprises a FATP as the first moiety, and a second moiety
comprising a linker sequence and an affinity ligand. Fusion proteins can be produced
by a variety of methods. For example, a fusion protein can be produced by the
insertion of a FATP gene or portion thereof into a suitable expression vector, such as
Bluescript SK +/- (Stratagene, La Jolla, CA), pGEX-4T-2 (Pharmacia, Peapack, NJ),
pET-24(+) (Novagen, Madison, WI), or vectors of similar construction. The resulting
construct can be introduced into a suitable host cell for expression. Upon expression,
fusion protein can be purified from cells by means of a suitable affinity matrix (See
e.g., Current Protocols in Molecular Biology, Ausubel, F.M. et al., eds., Vol. 2, pp.
16.4.1-16.7.8, containing supplements up through Supplement 42, 1998).

The invention also relates to enzymatically produced, synthetically produced,
or recombinantly produced portions of a fatty acid transport protein. Portions of a
FATP can be made which have full or partial function on their own, or which when
mixed together (though fully, partially, or nonfunctional alone), spontaneously
assemble with one or more other polypeptides to reconstitute a functional protein
having at least one function characteristic of a FATP.

Fragments of a FATP can be produced by direct peptide synthesis, for
example those using solid-phase techniques (Roberge, J.Y. et al., Science
269:202-204 (1995); Merrifield, J., J. Am. Chem. Soc. 85:2149-2154 (1963)). Protein
synthesis can be performed using manual techniques or by automation. Automated
synthesis can be carried out using, for instance, an Applied Biosystems 431A Peptide
Synthesizer (Perkin Elmer). Various fragments of a FATP can be synthesized
separately and combined using chemical methods.

One aspect of the invention is a peptide or polypeptide having the amino acid
sequence of a portion of a fatty acid transport protein which is hydrophilic rather than
hydrophobic, and ordinarily can be detected as facing the outside of the cell
membrane. Such a peptide or polypeptide can be thought of as being an extracellular
domain of the FATP, or a mimetic of said extracellular domain. It is known, for
example, that a portion of human FATP4 that includes a highly conserved motif is
involved in AMP-CoA binding function (Stuhlsatz-Krouper, S.M. et al., J. Biol.
Chem. 44:28642-28650 (1998)).

The term "mimetic" as used herein, refers to a molecule, the structure of 
which is developed from knowledge of the structure of the FATP of interest, or one
or more portions thereof, and, as such, is able to effect some or all of the functions of
a FATP. 

Portions of a FATP can be prepared by enzymatic cleavage of the isolated
protein, or can be made by chemical synthesis methods. Portions of a FATP can also
be made by recombinant DNA methods in which restriction fragments, or fragments
that may have undergone further enzymatic processing, or synthetically made DNAs
are joined together to construct an altered FATP gene. The gene can be made such
that it encodes one or more desired portions of a FATP. These portions of FATP can
be entirely homologous to a known FATP, or can be altered in amino acid sequence
relative to naturally occurring FATPs to enhance or introduce desired properties such
as solubility, stability, or affinity to a ligand. A further feature of the gene can be a
sequence encoding an N-terminal signal peptide directed to the plasma membrane.

An extracellular domain can be determined by a hydrophobicity plot, such as
those shown in Figures 28A, 29A, and 35A, or by a hydrophilicity plot such as those
shown in Figures 28C, 29C, 35C, 91, 92 and 93. A polypeptide or peptide
comprising all or a portion of a FATP extracellular domain can be used in a
pharmaceutical composition. When administered to a mammal by an appropriate
route, the polypeptide or peptide can bind to fatty acids and compete with the native
FATPs in the membrane of cells, thereby making fewer fatty acid molecules available
as substrates for transport into cells, and reducing the amount of fatty acids taken up
by, for example, the heart, in the case of FATP6.
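
A hydrophobicity profile of the general kind referred to above can be sketched in Python as a sliding-window average over per-residue hydropathy values. This description does not state which scale or window size was used for the figures; the widely used Kyte-Doolittle values and a window of 9 residues are assumptions of this sketch, as is the function name.

```python
# Illustrative sketch only: Kyte-Doolittle hydropathy values and the window
# size are assumptions; the scale used for the figures is not stated here.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list:
    """Return the mean hydropathy of every window along the sequence.

    Positive averages suggest hydrophobic (possibly membrane-spanning)
    stretches; negative averages suggest hydrophilic stretches.
    """
    s = protein.upper()
    if window < 1 or window > len(s):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[aa] for aa in s]
    return [sum(values[i:i + window]) / window
            for i in range(len(s) - window + 1)]


if __name__ == "__main__":
    print(hydropathy_profile("MKKLLIVVAAGGDEEK", window=5))
```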

Another aspect of the invention relates to a method of producing a fatty acid
transport protein, variants or portions thereof, and to expression systems and host
cells containing a vector appropriate for expression of a fatty acid transport protein. 

Cells that express a FATP, a variant or a portion thereof, or an ortholog of a
FATP described herein by amino acid sequence, can be made and maintained in
culture, under conditions suitable for expression, to produce protein in the cells for
cell-based assays, or to produce protein for isolation. These cells can be procaryotic
or eucaryotic. Examples of procaryotic cells that can be used for expression include
Escherichia coli, Bacillus subtilis and other bacteria. Examples of eucaryotic cells
that can be used for expression include yeasts such as Saccharomyces cerevisiae,
Schizosaccharomyces pombe, Pichia pastoris and other lower eucaryotic cells, and
cells of higher eucaryotes such as those from insects and mammals, such as primary
cells and cell lines such as CHO, HeLa, 3T3 and BHK cells, preferably COS cells and
human kidney 293 cells, and more preferably Jurkat cells. (See, e.g., Ausubel, F.M.
et al., eds. Current Protocols in Molecular Biology, Greene Publishing Associates
and John Wiley & Sons, Inc., containing Supplements up through Supplement 42,
1998).

In one embodiment, host cells that produce a recombinant FATP, or a portion
thereof, a variant, or an ortholog of a FATP described herein by amino acid sequence,
can be made as follows. A gene encoding a FATP, variant or a portion thereof can be
inserted into a nucleic acid vector, e.g., a DNA vector, such as a plasmid, phage,
cosmid, phagemid, virus, virus-derived vector (e.g., SV40, vaccinia, adenovirus, fowl
pox virus, pseudorabies viruses, retroviruses) or other suitable replicon, which can be
present in a single copy or multiple copies, or the gene can be integrated in a host cell
chromosome. A suitable replicon or integrated gene can contain all or part of the
coding sequence for a FATP or variant, operably linked to one or more expression
control regions whereby the coding sequence is under the control of transcription
signals and linked to appropriate translation signals to permit translation. The vector
can be introduced into cells by a method appropriate to the type of host cells (e.g.,
transfection, electroporation, infection). For expression from the FATP gene, the
host cells can be maintained under appropriate conditions (e.g., in the presence of
inducer, normal growth conditions, etc.). Proteins or polypeptides thus produced can
be recovered (e.g., from the cells, as in a membrane fraction, from the periplasmic
space of bacteria, from culture medium) using suitable techniques. Appropriate
membrane targeting signals may be incorporated into the expressed polypeptide.
These signals may be endogenous to the polypeptide or they may be heterologous
signals.

Polypeptides of the invention can be recovered and purified from cell cultures 
(or from their primary cell source) by well-known methods including ammonium 
sulfate or ethanol precipitation, acid extraction, anion or cation exchange 
chromatography, phosphocellulose chromatography, hydrophobic interaction
chromatography, affinity chromatography, hydroxylapatite chromatography and high
performance liquid chromatography. Known methods for refolding protein can be 
used to regenerate active conformation if the polypeptide is denatured during 
isolation or purification. 

In a further aspect of the invention are methods for assessing the transport
function of any of the fatty acid transport proteins or polypeptides described herein,
including orthologs, and in variations of these, methods for identifying an inhibitor 
(or an enhancer) of such function and methods for assessing the transport function in 
the presence of a candidate inhibitor or a known inhibitor. 

A variety of systems comprising living cells can be used for these methods.
Cells to be used in fatty acid transport assays, and further in methods for identifying
an inhibitor or enhancer of this function, express one or more FATPs. See Examples
3, 6, 9, 12 and 14 for data on tissue distribution of expression of FATPs, and
Examples 10 and 11 describing recombinant cells expressing FATP. Cells for use in
cell-based assays described herein can be drawn from a variety of sources, such as
isolated primary cells of various organs and tissues wherein one or more FATPs are
naturally expressed. In some cases, the cells can be from adult organs, and in some
cases, from embryonic or fetal organs, such as heart, lung, liver, intestine, skeletal
muscle, kidney and the like. Cells for this purpose can also include cells cultured as
fragments of organs or in conditions simulating the cell type and/or tissue
organization of organs, in which artificial materials may be used as substrates for cell
growth. Other types of cells suitable for this purpose include cells of a cell strain or
cell line (ordinarily comprising cells considered to be "transformed") transfected to
express one or more FATPs.

A further embodiment of the invention is a method for detecting, in a sample
of cells, a fatty acid transport protein, a portion or fragment thereof, a fusion protein
comprising a FATP or a portion thereof, or an ortholog as described herein, wherein
the cells can be, for instance, cells of a tissue, primary culture cells, or cells of a cell
line, including cells into which nucleic acid has been introduced. The method
comprises adding to the sample an agent that specifically binds to the protein, and
detecting the agent specifically bound to the protein. Appropriate washing steps can
be added to reduce nonspecific binding to the agent. The agent can be, for example,
an antibody, a ligand or a substrate mimic. The agent can have incorporated into it,
or have bound to it, covalently or by high affinity non-covalent interactions, for
instance, a label that facilitates detection of the agent to which it is bound, wherein
the label can be, but is not limited to, a phosphorescent label, a fluorescent label, a
biotin or avidin label, or a radioactive label. The means of detection of a fatty acid
transport protein can vary, as appropriate to the agent and label used. For example,
for an antibody that binds to the fatty acid transport protein, the means of detection
may call for binding a second antibody, which has been conjugated to an enzyme, to
the antibody which binds the fatty acid transport protein, and detecting the presence
of the second antibody by means of the enzymatic activity of the conjugated enzyme.

Similar principles can also be applied to a cell lysate or a more purified
preparation of proteins from cells that may comprise a fatty acid transport protein of
interest, for example in the methods of immunoprecipitation, immunoblotting,
immunoaffinity methods, that in addition to detection of the particular FATP, can
also be used in purification steps, and qualitative and quantitative immunoassays.
See, for instance, chapters 11 through 14 in Antibodies: A Laboratory Manual, E.
Harlow and D. Lane, eds., Cold Spring Harbor Laboratory, 1988.

Isolated fatty acid transport protein, or an antigenically similar portion thereof,
especially a portion that is soluble, can be used in a method to select and identify
molecules which bind specifically to the FATP. Fusion proteins comprising all of, or
a portion of, the fatty acid transport protein linked to a second moiety not occurring in
the FATP as found in nature, can be prepared for use in another embodiment of the
method. Suitable fusion proteins for this purpose include those in which the second
moiety comprises an affinity ligand (e.g., an enzyme, antigen, epitope). FATP fusion
proteins can be produced by the insertion of a gene encoding the FATP or a variant
thereof, or a suitable portion of such gene into a suitable expression vector, which
encodes an affinity ligand (e.g., pGEX-4T-2 and pET-15b, encoding glutathione
S-transferase and His-Tag affinity ligands, respectively). The expression vector can be
introduced into a suitable host cell for expression. Host cells are lysed and the lysate,
containing fusion protein, can be bound to a suitable affinity matrix by contacting the
lysate with an affinity matrix. In a particular embodiment, a nucleic acid
encodes a portion of a FATP polypeptide which includes a motif or domain, for
example, a lipocalin domain or an AMP-binding domain. Such a polypeptide portion
can be a functional portion of a FATP protein. The term "lipocalin domain" is an
art-recognized term and as used herein refers to a particular domain present in FATP
proteins. This domain is described as including regions of sequence homology as
well as a common tertiary structure represented as an eight stranded antiparallel
beta-barrel (see Banaszak, L. et al., Advances in Protein Chemistry, 45: 89-151). Many
lipocalin domains can be identified structurally as a sequence contained within the
general formula:
[DENG]-X-[DENQGSTARK]-X(0,2)-[DENQARK]-[LIVFY]-{CP}-G-{C}-W-[FYWLRH]-X-[LIVMTA],
e.g., the lipocalin signature sequence or
consensus pattern (SEQ ID NO: 125). One skilled in the art will recognize that a
lipocalin domain for a particular FATP protein can vary in sequence from this general
formula. A FATP lipocalin domain can be, for example, identical to the lipocalin
signature sequence, or can exhibit 60, 65, 70, 75, 80, 85, 90, 95 percent or greater
sequence identity in comparison to the general formula provided that it still retains the
necessary lipocalin binding function.
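
As an illustration of how a sequence can be checked against such a general formula, the following Python sketch converts a signature written in this notation into a regular expression and scans a protein sequence for it. It is only a sketch under stated assumptions: the pattern string is an assumed reading of the general formula above, the function names are hypothetical, and the test sequence is synthetic rather than any FATP sequence. Square brackets list permitted residues, braces list excluded residues, X is any residue, and X(0,2) is zero to two residues of any kind.

```python
import re

# Assumed reading of the general formula given in the text.
LIPOCALIN_SIGNATURE = (
    "[DENG]-X-[DENQGSTARK]-X(0,2)-[DENQARK]-[LIVFY]-"
    "{CP}-G-{C}-W-[FYWLRH]-X-[LIVMTA]"
)


def element_to_regex(element):
    """Translate one hyphen-delimited signature element into regex syntax."""
    core, repeat = element, ""
    if element.endswith(")"):
        # A trailing "(m,n)" or "(n)" is a repeat count, e.g. X(0,2).
        core, _, counts = element[:-1].partition("(")
        repeat = "{" + counts + "}"
    if core.upper() == "X":
        body = "."                        # any residue
    elif core.startswith("[") and core.endswith("]"):
        body = core                       # one of the listed residues
    elif core.startswith("{") and core.endswith("}"):
        body = "[^" + core[1:-1] + "]"    # any residue except those listed
    else:
        body = re.escape(core)            # a single fixed residue
    return body + repeat


def signature_to_regex(signature):
    """Translate a whole signature into a regular expression string."""
    return "".join(element_to_regex(e) for e in signature.split("-"))


def find_signature(sequence, signature=LIPOCALIN_SIGNATURE):
    """Return (1-based start position, matched residues) for each match."""
    regex = re.compile(signature_to_regex(signature))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    synthetic = "MMMDAKELAGKWYALMMM"  # made-up sequence for demonstration only
    print(signature_to_regex(LIPOCALIN_SIGNATURE))
    print(find_signature(synthetic))
```

A strict pattern match of this kind only finds sequences identical to the formula; domains that vary from it would need an identity-based comparison instead.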

For example, a lipocalin domain for each of the human FATPs, hsFATP1
(SEQ ID NO: 126), hsFATP2 (SEQ ID NO: 127), hsFATP3 (SEQ ID NO: 128),
hsFATP4 (SEQ ID NO: 129), hsFATP5 (SEQ ID NO: 130), and hsFATP6 (SEQ ID
NO: 131) has been identified. These particular lipocalin domains are located near the
N-terminal portion of the specified proteins (see Figure 118). The sequences of these
lipocalin domains are highly conserved across the FATP family. A search using the
lipocalin signature sequence conducted on a public database
(www.ebi.ac.uk/interpro/) indicated that the lipocalin domains of hsFATP1 and
hsFATP4 share identity with the signature sequence. In addition, a search directed to
identifying sequences having at least 80% identity to the lipocalin signature sequence
identified three additional human FATPs, hsFATP3, hsFATP5 and hsFATP6.

A lipocalin domain can also be identified functionally since, for example, it 
has been identified as a binding motif capable of binding fatty acids. In particular, 
the studies described in Experiment 20 demonstrated that fusion proteins including
the lipocalin domains from hsFATP4 bound long chain fatty acids such as oleates and 
palmitates with great specificity. Other fatty acids can also be used to assess binding 
in FATP4 and other members of the FATP family. 

Polypeptides, including fusion polypeptides, which contain a lipocalin domain
can also include additional components. For example, fusion polypeptides containing
a lipocalin domain can include amino acid residues from the portion of the protein
which is located upstream, i.e., in the direction of the N-terminal end of a FATP
protein, from the lipocalin domain. As the term "upstream sequences" is used herein
in relation to the lipocalin domain, it is intended to refer to the amino acid residues of
a FATP protein which are located between the signal peptide (when one is present)
and the lipocalin domain. In the absence of a signal peptide, the term refers to the
portion of a FATP protein between the lipocalin domain and the amino terminus (see
Figure 115).

Fusion polypeptides which contain a lipocalin domain can also include
additional domains or motifs, for example, an AMP binding domain can be included.
For example, an AMP binding domain for each of the human FATPs, hsFATP1 (SEQ
ID NO: 132), hsFATP2 (SEQ ID NO: 133), hsFATP3 (SEQ ID NO: 134), hsFATP4
(SEQ ID NO: 135), hsFATP5 (SEQ ID NO: 136) and hsFATP6 (SEQ ID NO: 137)
has been identified (see Figure 118).

In one embodiment, the fusion protein can be immobilized on a suitable
affinity matrix under conditions sufficient to bind the affinity ligand portion of the
fusion protein to the matrix, and is contacted with one or more candidate binding
agents (e.g., a mixture of peptides) to be tested, under conditions suitable for binding
of the binding agents to the FATP portion of the bound fusion protein. Next, the
affinity matrix with bound fusion protein can be washed with a suitable wash buffer
to remove unbound candidate binding agents and non-specifically bound candidate
binding agents. Those agents which remain bound can be released by contacting the
affinity matrix with fusion protein bound thereto with a suitable elution buffer. Wash
buffer can be formulated to permit binding of the fusion protein to the affinity matrix,
without significantly disrupting binding of specifically bound binding agents. In this
aspect, elution buffer can be formulated to permit retention of the fusion protein by
the affinity matrix, but can be formulated to interfere with binding of the candidate
binding agents to the target portion of the fusion protein. For example, a change in
the ionic strength or pH of the elution buffer can lead to release of specifically bound
agent, or the elution buffer can comprise a release component or components
designed to disrupt binding of specifically bound agent to the target portion of the
fusion protein.

Immobilization can be performed prior to, simultaneous with, or after,
contacting the fusion protein with candidate binding agent, as appropriate. Various
permutations of the method are possible, depending upon factors such as the
candidate molecules tested, the affinity matrix-ligand pair selected, and elution buffer
formulation. For example, after the wash step, fusion protein with binding agent
molecules bound thereto can be eluted from the affinity matrix with a suitable elution
buffer (a matrix elution buffer, such as glutathione for a GST fusion). Where the
fusion protein comprises a cleavable linker, such as a thrombin cleavage site,
cleavage from the affinity ligand can release a portion of the fusion with the candidate
agent bound thereto. Bound agent molecules can then be released from the fusion
protein or its cleavage product by an appropriate method, such as extraction.

One or more candidate binding agents can be tested simultaneously. Where a
mixture of candidate binding agents is tested, those found to bind by the foregoing
processes can be separated (as appropriate) and identified by suitable methods (e.g.,
PCR, sequencing, chromatography). Large libraries of candidate binding agents (e.g.,
peptides, RNA oligonucleotides) produced by combinatorial chemical synthesis or by
other methods can be tested (see e.g., Ohlmeyer, M.H.J. et al., Proc. Natl. Acad. Sci.
USA 90:10922-10926 (1993) and DeWitt, S.H. et al., Proc. Natl. Acad. Sci. USA
90:6909-6913 (1993), relating to tagged compounds; see also Rutter, W.J. et al., U.S.
Patent No. 5,010,175; Huebner, V.D. et al., U.S. Patent No. 5,182,366; and Geysen,
H.M., U.S. Patent No. 4,833,092). Random sequence RNA libraries (see Ellington,
A.D. et al., Nature 346:818-822 (1990); Bock, L.C. et al., Nature 355:564-566
(1992); and Szostak, J.W., Trends in Biochem. Sci. 17:89-93 (March, 1992)) can also
be screened according to the present method to select RNA molecules which bind to a
target FATP or FATP fusion protein. Where binding agents selected from a
combinatorial library by the present method carry unique tags, identification of
individual biomolecules by chromatographic methods is possible. Where binding
agents do not carry tags, chromatographic separation, followed by mass spectrometry
to ascertain structure, can be used to identify binding agents selected by the method,
for example.

The invention also comprises a method for identifying an agent which inhibits
interaction between a fatty acid transport protein (e.g., one comprising the amino acid
sequence in SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:117, SEQ ID NO:53, SEQ
ID NO:55, or SEQ ID NO:57), and a ligand of said protein. The FATP can be one
described by an amino acid sequence herein, a portion or fragment thereof, a variant
thereof, or an ortholog thereof, or a FATP fusion protein. Here, a ligand can be, for
instance, a substrate, or a substrate mimic, an antibody, or a compound, such as a
peptide, that binds with specificity to a site on the protein. The method comprises
combining, not limited to a particular order, the fatty acid transport protein, the ligand
of the protein, and a candidate agent to be assessed for its ability to inhibit interaction
between the protein and the ligand, under conditions appropriate for interaction
between the protein and the ligand (e.g., pH, salt, temperature conditions conducive
to appropriate conformation and molecular interactions); determining the extent to
which the protein and ligand interact; and comparing (1) the extent of protein-ligand
interaction in the presence of candidate agent with (2) the extent of protein-ligand
interaction in the absence of candidate agent, wherein if (1) is less than (2), then the
candidate agent is one which inhibits interaction between the protein and the ligand.
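
The comparison of (1) with (2) can be expressed numerically. The following Python sketch is one possible way to do so, assuming the extent of interaction has been reduced to a single signal value (for example, bound labeled ligand) for each condition; the function names and the example values are hypothetical and are not taken from the examples herein.

```python
def percent_inhibition(with_agent, without_agent):
    """Percent reduction in protein-ligand interaction caused by the agent.

    with_agent    -- extent of interaction in the presence of candidate agent, (1)
    without_agent -- extent of interaction in the absence of candidate agent, (2)
    """
    if without_agent <= 0:
        raise ValueError("the no-agent signal must be positive")
    return 100.0 * (without_agent - with_agent) / without_agent


def inhibits_interaction(with_agent, without_agent):
    """Apply the stated criterion: the agent inhibits if (1) is less than (2)."""
    return with_agent < without_agent


if __name__ == "__main__":
    # Illustrative values only, in arbitrary signal units.
    print(percent_inhibition(250.0, 1000.0))
    print(inhibits_interaction(250.0, 1000.0))
```

In practice a margin for measurement error would be applied before concluding that (1) is less than (2); that refinement is omitted from this sketch.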

The method can be facilitated, for example, by using an experimental system
which employs a solid support (column chromatography matrix, wall of a plate,
microtiter wells, controlled pore glass, pins to be submerged in a solution, beads, etc.)
to which the protein can be attached. Accordingly, in one embodiment, the protein
can be fixed to a solid phase directly or indirectly, by a linker. The candidate agent to
be tested is added under conditions conducive for interaction and binding to the
protein. The ligand is added to the solid phase system under conditions appropriate
for binding. Excess ligand is removed, as by a series of washes done under
conditions that do not disrupt protein-ligand interactions. Detection of bound ligand
can be facilitated by using a ligand that carries a label (e.g., fluorescent,
chemiluminescent, radioactive). In a control experiment, protein and ligand are
allowed to interact in the absence of any candidate agent, under conditions otherwise
identical to those used for the "test" conditions where candidate inhibiting agent is
present, and any washes used in the test conditions are also used in the control. The
extent to which ligand binds to the protein in the presence of candidate agent is
compared to the extent to which ligand binds to the protein in the absence of the
candidate agent. If the extent to which interaction of the protein and the ligand
occurs is less in the presence of the candidate agent than in the absence of the
candidate agent, the candidate agent is an agent which inhibits interaction between
the protein and the ligand of the protein.

In a further embodiment, an inhibitor (or an enhancer) of a fatty acid transport
protein can be identified. The method comprises steps which are, or are variations of
the following: contacting the cells with fatty acid, wherein the fatty acid can be
labeled for convenience of detection; contacting a first aliquot of the cells with an
agent being tested as an inhibitor (or enhancer) of fatty acid uptake while maintaining
a second aliquot of cells under the same conditions but without contact with the
agent; and measuring (e.g., quantitating) fatty acid in the first and second aliquots of
cells; wherein a lesser quantity of fatty acid in the first aliquot compared to that in
the second aliquot is indicative that the agent is an inhibitor of fatty acid uptake by a
fatty acid transport protein. A greater quantity of fatty acid in the first aliquot
compared to that in the second aliquot is indicative that the agent is an enhancer of
fatty acid uptake by a fatty acid transport protein.

A particular embodiment of identifying an inhibitor or enhancer of fatty acid
transport function employs the above steps, but also employs additional steps
preceding those given above: introducing into cells of a cell strain or cell line ("host
cells" for the intended introduction of, or after the introduction of, a vector) a vector
comprising a fatty acid transport protein gene, wherein expression of the gene can be
regulatable or constitutive, and providing conditions to the host cells under which
expression of the gene can occur.

The terms "contacting" and "combining", as used herein in the context of
bringing molecules into close proximity to each other, refer to steps that can be
accomplished by conventional means. For example, when referring to molecules that
are soluble, contacting is achieved by adding the molecules together in a solution.
"Contacting" can also be adding an agent to a test system, such as a vessel containing
cells in tissue culture.

The term "inhibitor" or "antagonist", as used herein, refers to an agent which
blocks, diminishes, inhibits, hinders, limits, decreases, reduces, restricts or interferes
with fatty acid transport into the cytoplasm of a cell, or alternatively and additionally,
prevents or impedes the cellular effects associated with fatty acid transport. The term
"enhancer" or "agonist", as used herein, refers to an agent which augments, enhances,
or increases fatty acid transport into the cytoplasm of a cell. An antagonist will
decrease fatty acid concentration, fatty acid metabolism and byproduct levels in the
cell, leading to phenotypic and molecular changes.

In order to produce a "host cell" type suitable for fatty acid uptake assays and
for assays derived therefrom for identifying inhibitors or enhancers thereof, a nucleic
acid vector can be constructed to comprise a gene encoding a fatty acid transport
protein, for example, human FATP1, FATP2, FATP3, FATP4, FATP5, FATP6, a
mutant or variant thereof, an ortholog of the human proteins, such as mouse orthologs
or orthologs found in other mammals, or a FATP family protein of origin in an
organism other than a mammal. The gene of the vector can be regulatable, such as by
the placement of the gene under the control of an inducible or repressible promoter in
the vector (e.g., inducible or repressible by a change in growth conditions of the host
cell harboring the vector, such as addition of inducer, binding or functional removal
of repressor from the cell milieu, or change in temperature) such that expression of
the FATP gene can be turned on or initiated by causing a change in growth
conditions, thereby causing the protein encoded by the gene to be produced, in host
cells comprising the vector, as a plasma membrane protein. Alternatively, the FATP
gene can be constitutively expressed.

A vector comprising a FATP gene, such as a vector described herein, can be
introduced into host cells by a means appropriate to the vector and to the host cell
type. For example, commonly used methods such as electroporation, transfection,
for instance, transfection using CaCl2, and transduction (as for a virus or
bacteriophage) can be used. Host cells can be, for example, mammalian cells such as
primary culture cells or cells of cell lines such as COS cells, 293 cells or Jurkat cells.
Host cells can also be, in some cases, cells derived from insects, cells of insect cell
lines, bacterial cells, such as E. coli, or yeast cells, such as S. cerevisiae. It is
preferred that the fatty acid transport protein whose function is to be assessed, with or
without a candidate inhibitor or enhancer, be produced in host cells whose ancestor
cells originated in a species related to the species of origin of the FATP gene
encoding the fatty acid transport protein. For example, it is preferable that tests of
function or of inhibition or enhancement of a mammalian FATP be carried out in host
mammalian cells producing the FATP, rather than bacterial cells or yeast cells.

Host cells comprising a vector comprising a regulatable FATP gene can be
treated so as to allow expression of the FATP gene and production of the encoded
protein (e.g., by contacting the cells with an inducer compound that effects
transcription from an inducible promoter operably linked to the FATP gene).

Alternatively, host cells containing an endogenous FATP gene can be
engineered to activate or deactivate expression of the FATP gene and production of
the encoded protein. For example, homologous recombination, often referred to as
targeting, can be utilized to alter the regulatory region associated with the FATP gene
to increase or decrease the level of expression. Alteration of the regulatory region
can include disablement of the regulatory region associated with the FATP gene
and/or replacement of the region or a portion of the region. A variety of regulatory
regions are known which can be transfected into cells to cause an endogenous gene to
display a pattern of induction or expression that differs from that of the cell prior to
transfection.

The test agent (e.g., an agonist or antagonist) is added to the cells to be used
in a fatty acid transport assay, in the presence or absence of test agent, under
conditions suitable for production and/or maintenance of the expressed FATP in a
conformation appropriate for association of the FATP with test agent and substrate.
For example, conditions under which an agent is assessed, such as media and
temperature requirements, can, initially, be similar to those necessary for transport of
typical fatty acid substrates across the plasma membrane. One of ordinary skill in the
art will know how to vary experimental conditions depending upon the biochemical
nature of the test agent. The test agent can be added to the cells in the presence of
fatty acid, or in the absence of fatty acid substrate, with the fatty acid substrate being
added following the addition of the test agent. The concentration at which the test
agent can be evaluated can be varied, as appropriate, to test for an increased effect
with increasing concentrations.
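
A test for an increased effect with increasing concentration can be sketched as follows in Python. This is an illustration only: the function names are hypothetical, the effect is assumed to be expressed as fractional inhibition of fatty acid uptake relative to cells that received no test agent, and the criterion used (effect never decreases as concentration rises, and is larger at the highest concentration than at the lowest) is one simple choice among many.

```python
def inhibition_by_concentration(uptake_by_conc, control_uptake):
    """Return (concentration, fractional inhibition) pairs, lowest concentration first.

    uptake_by_conc -- mapping of test-agent concentration to measured uptake
    control_uptake -- uptake measured without test agent (must be positive)
    """
    if control_uptake <= 0:
        raise ValueError("control uptake must be positive")
    return [
        (conc, 1.0 - uptake_by_conc[conc] / control_uptake)
        for conc in sorted(uptake_by_conc)
    ]


def shows_dose_dependence(uptake_by_conc, control_uptake):
    """True if inhibition never falls as concentration rises and ends higher."""
    effects = [e for _, e in inhibition_by_concentration(uptake_by_conc, control_uptake)]
    if len(effects) < 2:
        return False
    non_decreasing = all(later >= earlier for earlier, later in zip(effects, effects[1:]))
    return non_decreasing and effects[-1] > effects[0]


if __name__ == "__main__":
    # Illustrative values only: concentration (arbitrary units) -> uptake signal.
    readings = {1.0: 900.0, 10.0: 600.0, 100.0: 200.0}
    print(inhibition_by_concentration(readings, 1000.0))
    print(shows_dose_dependence(readings, 1000.0))
```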

Test agents to be assessed for their effects on fatty acid transport can be any
chemical (element, molecule, compound), made synthetically, made by recombinant
techniques or isolated from a natural source. For example, test agents can be
peptides, polypeptides, peptoids, sugars, hormones, or nucleic acid molecules, such as
antisense nucleic acid molecules. In addition, test agents can be small molecules or
molecules of greater complexity made by combinatorial chemistry, for example, and
compiled into libraries. These libraries can comprise, for example, alcohols, alkyl
halides, amines, amides, esters, aldehydes, ethers and other classes of organic
compounds. Test agents can also be natural or genetically engineered products
isolated from lysates of cells, bacterial, animal or plant, or can be the cell lysates
themselves. Presentation of test compounds to the test system can be in either an
isolated form or as mixtures of compounds, especially in initial screening steps.

Thus, the invention relates to a method for identifying agents which alter fatty
acid transport, the method comprising providing the test agent to the cell (wherein
"cell" includes the plural, and can include cells of a cell strain, cell line or culture of 
primary cells or organ culture, for example), under conditions suitable for binding to 
its target, whether to the FATP itself or to another target on or in the cell, wherein the 
transformed cell comprises a FATP. 

In greater detail, to test one or more agents or compounds (e.g., a mixture of
compounds can conveniently be screened initially) for inhibition of the transport
function of a fatty acid transport protein, the agent(s) can be contacted with the cells.
The cells can be contacted with a labeled fatty acid. The fatty acid can be, for
example, a known substrate of the fatty acid transport protein such as oleate or
palmitate. The fatty acid can itself be labeled with a radioactive isotope (e.g., ³H or
¹⁴C) or can have a radioactively labeled adduct attached. In other variations, the fatty
acid can have chemically attached to it a fluorescent label, or a substrate for an
enzyme occurring within the cells, wherein the substrate yields a detectable product,
such as a highly colored or fluorescent product. Addition of candidate inhibitors and
labeled substrate to the cells comprising fatty acid transport protein can be in either
order or can be simultaneous.

A second aliquot of cells, which can be called "control" cells (a "first" aliquot
of cells can be called "test" cells), is treated, if necessary (as in the case of
transformed "host" cells), so as to allow expression of the FATP gene, and is
contacted with the labeled substrate of the fatty acid transport protein. The second
aliquot of cells is not contacted with one or more agents to be tested for inhibition of
the transport function of the protein produced in the cells, but is otherwise kept under
the same culture conditions as the first aliquot of cells.

In a further step of a method to identify inhibitors of a fatty acid transport
protein, the labeled fatty acid is measured in the first and second aliquots of cells. A
preliminary step of this measurement process can be to separate the external medium
from the cells so as to be able to distinguish the labeled fatty acid external to the cells
from that which has been transported inside the cells. This can be accomplished, for
instance, by removing the cells from their growth container, centrifuging the cell
suspension, removing the supernatant and performing one or more wash steps to
extensively dilute the remaining medium which may contain labeled fatty acid.
Detection of the labeled fatty acid can be by a means appropriate to the label used.
For example, for a radioactive label, detection can be by scintillation counting of
appropriately prepared samples of cells (e.g., lysates or protein extracts); for a
fluorescent label, by measuring fluorescence in the cells by appropriate
instrumentation.

If a compound tested as a candidate inhibitor of transport function causes the
test cells to have less labeled fatty acid detected in the cells than that detected in the
control cells, then the compound is an inhibitor of the fatty acid transport protein.
Procedures analogous to those above can be devised for identifying enhancers
(agonists of FATPs) of fatty acid transport function wherein if the test cells contain
more labeled fatty acid than that detected in the control cells, or if the fatty acid is
taken up at a higher rate, then the compound being tested can be concluded to be an
enhancer of the fatty acid transport protein.
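
The decision rule for test cells versus control cells can be illustrated with a short Python sketch. It rests on assumptions not stated in the text: replicate measurements (for example, scintillation counts) are averaged, and a compound is called an inhibitor or enhancer only when the mean uptake in test cells differs from the control mean by more than a chosen fraction. The function name, the 20 percent default and the example counts are all hypothetical.

```python
from statistics import mean


def classify_compound(test_counts, control_counts, min_change=0.2):
    """Classify a compound from labeled fatty acid measured in cells.

    test_counts    -- replicate measurements from cells contacted with the compound
    control_counts -- replicate measurements from cells not contacted with it
    min_change     -- assumed minimum fractional difference treated as an effect
    """
    control_mean = mean(control_counts)
    if control_mean <= 0:
        raise ValueError("control measurements must average above zero")
    ratio = mean(test_counts) / control_mean
    if ratio < 1.0 - min_change:
        return "inhibitor"   # less labeled fatty acid in test cells
    if ratio > 1.0 + min_change:
        return "enhancer"    # more labeled fatty acid in test cells
    return "no effect"


if __name__ == "__main__":
    # Illustrative counts only.
    print(classify_compound([400, 420, 380], [1000, 980, 1020]))
    print(classify_compound([1500, 1500], [1000, 1000]))
    print(classify_compound([1000, 1010], [1000, 1000]))
```

A statistical test on the replicates could replace the fixed threshold where enough measurements are available.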

Example 13 describes use of an assay of this type to identify an inhibitor of a
FATP. In Example 13, an antisense oligonucleotide which specifically inhibits
biosynthesis of mmFATP4 was demonstrated to inhibit fatty acid uptake into mouse
enterocytes. Similarly, antisense oligonucleotides directed towards specifically
inhibiting the biosynthesis of FATP6 in heart cells, FATP5 in liver cells, FATP3 in
lung cells, and FATP2 in colon cells, can be demonstrated as examples of "test
agents" that inhibit fatty acid transport.

Another assay to determine whether an agent is an inhibitor (or enhancer) of
fatty acid transport employs animals, one or more of which are administered the
agent, and one or more of which are maintained under similar conditions, but are not
administered the agent. Both groups of animals are given fatty acids (e.g., orally,
intravenously, by tube inserted into stomach or intestine), and the fatty acids taken up
into a bodily fluid (e.g., serum) or into an organ or tissue of interest are measured
from comparable samples taken from each group of animals. The fatty acids may
carry a label (e.g., radioactive) to facilitate detection and quantitation of fatty acids
taken up into the fluid or tissue being sampled. This type of assay can be used alone
or can be used in addition to in vitro assays of a candidate inhibitor or enhancer.

An agent determined to be an inhibitor (or enhancer) of FATP
function, such as fatty acid binding and/or fatty acid uptake, can be administered to
cells in culture, or in vivo, to a mammal (e.g., a human) to inhibit (or enhance) FATP
function. Such an agent may be one that acts directly on the FATP (for example, by
binding) or can act on an intermediate in a biosynthetic pathway to produce FATP,
such as transcription of the FATP gene, processing of the mRNA, or translation of the
mRNA. An example of such an agent is an antisense oligonucleotide.

Antisense methods similar to those illustrated in Example 13 can be used to
determine the target FATP of a compound or agent that has an inhibitory or
enhancing effect on fatty acid uptake. For example, antisense oligonucleotide
directed to the inhibition of FATP4 biosynthesis can be added to lung cells or cell
lines derived from lung cells. In addition, antisense oligonucleotides directed to the
inhibition of other FATPs, except for FATP3, can also be added to the lung cells.
The administration of antisense oligonucleotides in this manner ensures that the
predominant FATP activity remaining in the cells comes from FATP3. After a period
of incubation of the cells with the antisense oligonucleotides sufficient to deplete the
plasma membrane of the FATPs whose biosynthesis has been inhibited, a test agent,
preferably one that has been shown by some preliminary test to have an inhibitory or
enhancing activity on fatty acid transport, can be added to the lung cells. If the test
agent is now demonstrated, after treatment of the cells with antisense
oligonucleotides, to have an inhibitory or enhancing activity on fatty acid transport in
the lung cells, it can be concluded that the target of the test agent is FATP3, or a
molecule involved in the biosynthesis or activity of FATP3.
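
The logic of this target-identification scheme can be written out as a brief Python sketch. The six human FATP names come from the text; the function names and the returned wording are hypothetical, and the sketch captures only the reasoning, not the laboratory procedure.

```python
HUMAN_FATPS = ["FATP1", "FATP2", "FATP3", "FATP4", "FATP5", "FATP6"]


def fatps_to_suppress(fatp_of_interest, candidates=HUMAN_FATPS):
    """List the FATPs to be targeted by antisense so that one FATP predominates."""
    if fatp_of_interest not in candidates:
        raise ValueError("unknown FATP: " + fatp_of_interest)
    return [name for name in candidates if name != fatp_of_interest]


def interpret_result(agent_still_active, fatp_of_interest):
    """Interpret the test agent's activity after antisense treatment.

    agent_still_active -- True if the agent still inhibits or enhances fatty
                          acid transport in the antisense-treated cells
    """
    if agent_still_active:
        return ("target is " + fatp_of_interest +
                " or a molecule involved in its biosynthesis or activity")
    # The text draws no conclusion for this case; none is asserted here.
    return "no conclusion about " + fatp_of_interest + " from this experiment"


if __name__ == "__main__":
    print(fatps_to_suppress("FATP3"))
    print(interpret_result(True, "FATP3"))
```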

In another type of cell-based assay for uptake of fatty acids, a change of
intracellular pH resulting from the uptake of fatty acids can be followed by an
indicator fluorophore. The fluorophore can be taken up by the cells in a
preincubation step. Fatty acids can be added to the cell medium, and after some
period of incubation to allow FATP-mediated uptake of fatty acids, the change in λmax
of fluorescence can be measured, as an indicator of a change in intracellular pH, as
the λmax of fluorescence of the fluorophore changes with the pH of its environment,
thereby indicating uptake of fatty acids. One such fluorophore is BCECF
(2',7'-bis(2-carboxyethyl)-5(6)-carboxyfluorescein; Rink, T.J. et al., J. Cell Biol. 95:189
(1982)).

In assays similar to those described above, a candidate inhibitor or enhancer
of fatty acid transport function can be added (or mock-added, for control cultures) to
cultures of cells engineered to express a desired FATP to which fatty acid substrate is
also added. Inhibition of fatty acid uptake is indicated by a lack of the drop in pH,
indicating fatty acid uptake, that is seen in control cells. Enhancement of fatty acid
uptake is indicated by a decrease in intracellular pH, as compared to control cells not
receiving the candidate enhancer of fatty acid transport function.
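
The comparison of treated and control cultures described above can be sketched in code. The following Python fragment is an illustration only: the function name, the scoring convention (treated pH drop expressed relative to the control pH drop) and the example pH values are assumptions made here, not values or procedures taken from the Examples.

```python
def percent_change_in_uptake(ph_before, ph_after_control, ph_after_treated):
    """Score a candidate agent from intracellular pH readings.

    Fatty acid uptake is read out as a drop in intracellular pH.
    The treated drop is expressed relative to the control drop:
    negative values suggest inhibition, positive values suggest
    enhancement. (Hypothetical scoring convention.)
    """
    control_drop = ph_before - ph_after_control
    treated_drop = ph_before - ph_after_treated
    if control_drop <= 0:
        raise ValueError("control cells show no pH drop; assay not interpretable")
    return 100.0 * (treated_drop - control_drop) / control_drop


# Illustrative readings only (not measured data)
inhibitor_score = percent_change_in_uptake(7.20, 7.00, 7.15)  # smaller drop than control
enhancer_score = percent_change_in_uptake(7.20, 7.00, 6.90)   # larger drop than control
```

In this sketch a candidate inhibitor gives a negative score and a candidate enhancer a positive one; any cutoff for calling an agent active would have to be chosen empirically.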

Yeast cells can be used in a similar cell-based assay for the uptake of fatty
acids mediated by a FATP, and such an assay can be adapted to a screening assay for
the identification of agents that inhibit or enhance fatty acid uptake by a FATP.
Yeast cells lacking an endogenous FATP activity (mutated, disrupted or deleted for
FAT1; Faergeman, N.J. et al., J. Biol. Chem. 272(13):8531-8538 (1997); Watkins,
P.A. et al., J. Biol. Chem. 273(29):18210-18219 (1998)) can be engineered to harbor
a related gene of the family of FATP-encoding genes, such as a mammalian FATP
(e.g., human FATP4).

Examples of expression vectors include pEG (Mitchell, D.A., et al., Yeast
9:715-723 (1993)) and pDAD1 and pDAD2, which contain a GAL1 promoter (Davis,
L. I. and Fink, G. R., Cell 61:965-978 (1990)). A variety of promoters are suitable
for expression. Available yeast vectors offer a choice of promoters. In one
embodiment, the inducible GAL1 promoter is used. In another embodiment, the
constitutive ADH1 promoter (alcohol dehydrogenase; Bennetzen, J. L. and Hall, B.
D., J. Biol. Chem. 257:3026-3031 (1982)) can be used to express an inserted gene on
glucose-containing media. An example of a vector suitable for expression of a
heterologous FATP gene in yeast is pQB169.

With the introduced FATP gene providing the only fatty acid transport protein
function for the yeast cells, it is possible to study the effect of the heterologous FATP on
fatty acid transport into the yeast cells in isolation. Assays for the uptake of fatty
acids into the yeast cells can be devised that are similar to those described above
and/or those assays that have been illustrated in the Examples. Tests for candidate
inhibitors or enhancers of the heterologous FATP can be done in cultures of yeast
cells, wherein the yeast cells are incubated with fatty acid substrate and an agent to be
tested as an inhibitor or enhancer of FATP function. Fatty acid uptake after a period of
time can be measured by analyzing the contents of the yeast cells for fatty acid
substrate, as compared with control yeast cells incubated with the fatty acid, but not
with the test agent. Yeast cells have the additional advantage, over mammalian cells
in culture, for example, that yeast cells can be forced to rely upon fatty acids as their
only source of carbon, if the growth medium supplied to the yeast cells is formulated
to contain no other source of carbon. Thus, the effect of the heterologous FATP on
fatty acid uptake and metabolism in the engineered yeast cells can be amplified. An
agent that efficiently blocks transport function of the heterologous FATP could result
in death of the yeast cells. Thus, in this case, inhibition of function of the
heterologous FATP can result in loss of viability. A simple measure of viability is
turbidity of the yeast suspension culture, which can be adapted to a high throughput
screening assay for effects of various agents to be tested, using microtiter plates or
similar devices for small-volume cultures of the engineered yeast cells.
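
As a rough illustration of how turbidity readings from a microtiter plate might be scored in such a screen, the following Python sketch flags wells whose optical density falls below a chosen fraction of the mean of the untreated control wells. The function name, the well labels, the 0.5 cutoff and the readings are all hypothetical and are shown only to make the comparison concrete.

```python
def call_growth_inhibitors(od_by_well, control_wells, threshold=0.5):
    """Flag wells whose turbidity falls below a fraction of the control mean.

    od_by_well    -- dict mapping well name to an optical density reading
    control_wells -- wells that received fatty acid but no test agent
    threshold     -- hypothetical cutoff, as a fraction of control growth
    """
    if not control_wells:
        raise ValueError("at least one control well is required")
    control_mean = sum(od_by_well[w] for w in control_wells) / len(control_wells)
    if control_mean <= 0:
        raise ValueError("control wells show no growth; plate not interpretable")
    hits = {}
    for well, od in od_by_well.items():
        if well in control_wells:
            continue
        relative_growth = od / control_mean
        if relative_growth < threshold:
            hits[well] = round(relative_growth, 2)
    return hits


# Illustrative readings only (not measured data)
plate = {"A1": 0.80, "A2": 0.84, "B1": 0.78, "B2": 0.20, "B3": 0.60}
hits = call_growth_inhibitors(plate, control_wells=["A1", "A2"])
```

Here only well B2 grows to less than half of the control turbidity, so only the agent in that well would be carried forward as a candidate inhibitor under these assumed settings.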

Cell-free assays can also be used to measure the transport of fatty acids across
a membrane, and therefore also to assess a test treatment or test agent for its effect on
the rate or extent of fatty acid transport. An isolated FATP, for example in the
presence of a detergent that preserves the native 3-dimensional structure of the FATP,
or partially purified FATP, can be used in an artificial membrane system typically
used to preserve the native conformation and activity of membrane proteins. Such
systems include liposomes, artificial bilayers of phospholipids, isolated plasma
membrane such as cell membrane fragments, cell membrane fractions, or cell
membrane vesicles, and other systems in which the FATP can be properly oriented
within the membrane to have transport activity. Assays for transport activity can be
performed using methods analogous to those that can be used in cells engineered to
predominantly express one FATP whose function is to be measured. A labeled (e.g.,
radioactively labeled) fatty acid substrate can be incubated with one side of a bilayer
or in a suspension of liposomes constructed to integrate a properly oriented FATP.
The accumulation of fatty acids with time can be measured, using appropriate means
to detect the label (e.g., scintillation counting of medium on each side of the bilayer,
or of the contents of liposomes isolated from the surrounding medium). Assays such
as these can be adapted to use for the testing of agents which might interact with the
FATP to produce an inhibitory or an enhancing effect on the rate or extent of fatty
acid transport. That is, the above-described assay can be done in the presence or
absence of the agent to be tested, and the results compared.
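
The comparison of label accumulation with and without a test agent can be illustrated with a short Python sketch. It assumes, purely for illustration, that accumulation is roughly linear over the sampled time points, so that a least-squares slope serves as the transport rate; the function names and the count values are hypothetical.

```python
def uptake_rate(times_min, counts_cpm):
    """Least-squares slope of accumulated label versus time (cpm per minute)."""
    n = len(times_min)
    if n < 2 or n != len(counts_cpm):
        raise ValueError("need at least two matched time points")
    mean_t = sum(times_min) / n
    mean_c = sum(counts_cpm) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times_min, counts_cpm))
    return sxy / sxx


def relative_transport(times_min, counts_without_agent, counts_with_agent):
    """Ratio of the uptake rate with agent to the rate without agent.

    Values below 1 suggest inhibition; values above 1 suggest enhancement.
    """
    baseline = uptake_rate(times_min, counts_without_agent)
    if baseline == 0:
        raise ValueError("no transport detected in the absence of agent")
    return uptake_rate(times_min, counts_with_agent) / baseline


# Illustrative scintillation counts only (not measured data)
times = [0, 2, 4, 6]
no_agent = [100, 500, 900, 1300]
with_agent = [100, 200, 300, 400]
ratio = relative_transport(times, no_agent, with_agent)
```

With these made-up counts the rate in the presence of the agent is one quarter of the rate in its absence. A real assay would also need replicate measurements and a check that the time course is in its linear range.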

For examples of isolation of membrane proteins (ADP/ATP carrier and
uncoupling protein), reconstitution into phospholipid vesicles, and assays of
transport, see Klingenberg, M. et al., Methods Enzymol. 250:369-389 (1995). For an
example of a membrane protein (phosphate carrier of Saccharomyces cerevisiae) that
was purified and solubilized from E. coli inclusion bodies, see Schroer, A. et al., J.
Biol. Chem. 273:14269-14276 (1998). The Glut1 glucose transporter of rat has been
expressed in yeast. A crude membrane fraction of the yeast was prepared and
reconstituted with soybean phospholipids into liposomes. Glucose transport activity
could be measured in the liposomes (Kasahara, T. and Kasahara, M., J. Biol. Chem.
273:29113-29117 (1998)). Similar methods can be applied to the proteins and
polypeptides of the invention.

Another embodiment of the invention is a method for inhibiting fatty acid
uptake in a mammal (e.g., a human), comprising administering to the mammal a
therapeutically effective amount of an inhibitor of the transport function of one or
more of the fatty acid transport proteins, thereby decreasing fatty acid uptake by cells
comprising the fatty acid transport protein(s). Where it is desirable to reduce the uptake of
fatty acids, for example, in the treatment of chronic obesity or as a part of a program
of weight control or hyperlipidemia control in a human, one or more inhibitors of one
or more of the fatty acid transport proteins can be administered in an effective dose,
and by an effective route, for example, orally, or by an indwelling device that can
deliver doses to the small intestine. The inhibitor can be one identified by methods
described herein, or can be one that is, for instance, structurally related to an inhibitor
identified by methods described herein (e.g., having chemical adducts to better
stabilize or solubilize the inhibitor). The invention further relates to compositions
comprising inhibitors of fatty acid uptake in a mammal, which may further comprise
pharmaceutical carriers suitable for administration to a subject mammal, such as
sterile solubilizing or emulsifying agents.

A further embodiment of the present invention is a method of enhancing or
increasing fatty acid uptake, such as enhancing or increasing LCFA uptake in the
small intestine (e.g., to treat or prevent a malabsorption syndrome or other wasting
condition) or in the liver (e.g., by an enhancer of FATP5 transport activity to treat
acute liver failure) or in the kidney (e.g., by an enhancer of FATP2 transport activity
to treat kidney failure). In this embodiment, a therapeutically effective amount of an
enhancer of the transport function of one or more of the fatty acid transport proteins
can be administered to a mammalian subject, with the result that fatty acid uptake in
the small intestine is enhanced. In this embodiment, one or more enhancers of one or
more of the fatty acid transport proteins are administered in an effective dose and by a
route (e.g., orally or by a device, such as an indwelling catheter or other device)
which can deliver doses to the gut. The enhancer of FATP function (e.g., an enhancer
of FATP4 function) can be identified by methods described herein or can be one that
is structurally similar to an enhancer identified by methods described herein.

Aerobic reperfusion of ischemic myocardium is a common clinical event
which can occur during such treatments as cardiac surgery, angioplasty, and
thrombolytic therapy after a myocardial infarction. During reperfusion, a rapid
recovery of myocardial energy production is essential for the complete recovery of
contractile function. Not only the extent of recovery of myocardial energy
metabolism but also the type of energy substrate used by the heart during reperfusion
are important determinants of functional recovery. Circulating fatty acid levels
increase following acute myocardial infarction or during cardiac surgery, such that
during and following ischemia the heart muscle can be exposed to very high
concentrations of fatty acids (Lopaschuk, G.D. and W. C. Stanley, Science and
Medicine (November/December 1997)). High plasma fatty acid concentrations
increase the severity of ischemic damage in a number of experimental models of
cardiac ischemia and have been linked to depression of mechanical function during
aerobic reperfusion of previously ischemic hearts. Further data show that modifying
fatty acid utilization can be beneficial for heart function in ischemia and can be a
useful approach for the treatment of angina. See, e.g., Desideri and Celegon, Am. J.
Cardiol. 82(5A):50K-53K; Lopaschuk, Am. J. Cardiol. 82(5A):14K-17K. Plasma
fatty acid concentrations can be reduced by administering to a human subject or other
mammal an effective amount of an inhibitor of a FATP such as FATP2 or FATP4,
thereby providing a way of reducing fatty acid utilization by the heart.

In a further embodiment of the invention, a therapeutically effective amount
of an inhibitor of hsFATP6 can be administered to a human patient by a suitable
route, to reduce the uptake of fatty acids by cardiac muscle. This treatment is
desirable in patients who are diagnosed as having, or who are at risk of, abnormal
accumulations of fatty acids in the heart or a detrimentally high rate of uptake of fatty
acids into the heart, because of ischemic heart disease, or following ischemia or
trauma to the heart.

The invention further relates to antibodies that bind to an isolated or
recombinant fatty acid transport protein of the FATP family, including portions of
antibodies, which can specifically recognize and bind to one or more FATPs. The
antibodies and portions thereof of the invention include those which bind to one or
more FATPs of mouse or other mammalian species. In a preferred embodiment, the
antibodies specifically bind to a naturally occurring FATP of humans. The antibodies
can be used in methods to detect or to purify a protein of the present invention or a
portion thereof by various methods of immunoaffinity chromatography, to inhibit the
function of a protein in a method of therapy, or to selectively inactivate an active site,
or to study other aspects of the structure of these proteins, for example.

The antibodies of the present invention can be polyclonal or monoclonal. The
term antibody is intended to encompass both polyclonal and monoclonal antibodies.
Antibodies of the present invention can be raised against an appropriate immunogen,
including proteins or polypeptides of the present invention, such as an isolated or
recombinant FATP1, FATP2, FATP3, FATP4, FATP5, FATP6, mtFATP, ceFATPa,
ceFATPb, scFATP or portions thereof, or synthetic molecules, such as synthetic
peptides (e.g., conjugated to a suitable carrier). Preferred embodiments are antibodies
that bind to any of the following: hsFATP1, hsFATP2, hsFATP3, hsFATP4,
hsFATP5 or hsFATP6. The immunogen can be a polypeptide comprising a portion
of a FATP and having at least one function of a fatty acid transport protein, as
described herein.

The term antibody is also intended to encompass single chain antibodies,
chimeric, humanized or primatized (CDR-grafted) antibodies and the like, as well as
chimeric or CDR-grafted single chain antibodies, comprising portions from more
than one species. For example, the chimeric antibodies can comprise portions of
proteins derived from two different species, joined together chemically by
conventional techniques or prepared as a single contiguous protein using genetic
engineering techniques (e.g., DNA encoding the protein portions of the chimeric
antibody can be expressed to produce a contiguous protein chain. See, e.g., Cabilly et
al., U.S. Patent No. 4,816,567; Cabilly et al., European Patent No. 0,125,023 B1;
Boss et al., U.S. Patent No. 4,816,397; Boss et al., European Patent No. 0,120,694
B1; Neuberger, M.S. et al., WO 86/01533; Neuberger, M.S. et al., European Patent
No. 0,194,276 B1; Winter, U.S. Patent No. 5,225,539; Winter, European Patent No.
0,239,400 B1; Queen et al., U.S. Patent No. 5,585,089; and Queen et al., European
Patent No. EP 0 451 216 B1. See also, Newman, R. et al., Bio/Technology, 10:1455-1460
(1992), regarding primatized antibody, and Ladner et al., U.S. Patent No.
4,946,778 and Bird, R.E. et al., Science, 242:423-426 (1988) regarding single chain
antibodies.)

Whole antibodies and biologically functional fragments thereof are also
encompassed by the term antibody. Biologically functional antibody fragments
which can be used include those fragments sufficient for binding of the antibody
fragment to a FATP to occur, such as Fv, Fab, Fab' and F(ab')2 fragments. Such
fragments can be produced by enzymatic cleavage or by recombinant techniques. For
instance, papain or pepsin cleavage can generate Fab or F(ab')2 fragments,
respectively. Antibodies can also be produced in a variety of truncated forms using
antibody genes in which one or more stop codons have been introduced upstream of
the natural stop site. For example, a chimeric gene encoding a F(ab')2 heavy chain
portion can be designed to include DNA sequences encoding the CH1 domain and
hinge region of the heavy chain.

Preparation of immunizing antigen (whole cells comprising FATP on the cell
surface or purified FATP), and polyclonal and monoclonal antibody production can
be performed using any suitable technique. A variety of methods have been
described (See e.g., Kohler et al., Nature, 256: 495-497 (1975) and Eur. J. Immunol.
6: 511-519 (1976); Milstein et al., Nature 266: 550-552 (1977); Koprowski et al.,
U.S. Patent No. 4,172,124; Harlow, E. and D. Lane, 1988, Antibodies: A Laboratory
Manual, (Cold Spring Harbor Laboratory: Cold Spring Harbor, NY); Chapter 11 In
Current Protocols In Molecular Biology, Vol. 2 (containing supplements up through
Supplement 42, 1998), Ausubel, F.M. et al., eds., (John Wiley & Sons: New York,
NY)). Generally, a hybridoma can be produced by fusing a suitable immortal cell
line (e.g., a myeloma cell line such as SP2/0) with antibody producing cells. The
antibody producing cells, preferably those obtained from the spleen or lymph nodes,
can be obtained from animals immunized with the antigen of interest. Immunization
of animals can be by introduction of whole cells comprising fatty acid transport
protein on the cell surface. The fused cells (hybridomas) can be isolated using
selective culture conditions, and cloned by limiting dilution. Cells which produce
antibodies with the desired specificity can be selected by a suitable assay (e.g.,
ELISA).

Other suitable methods of producing or isolating antibodies (including human
antibodies) of the requisite specificity can be used, including, for example, methods
which select recombinant antibody from a library (e.g., Hoogenboom et al.,
WO 93/06213; Hoogenboom et al., U.S. Patent No. 5,565,332; WO 94/13804,
published June 23, 1994; and Dower, W.J. et al., U.S. Patent No. 5,427,908), or
which rely upon immunization of transgenic animals (e.g., mice) capable of
producing a full repertoire of human antibodies (see e.g., Jakobovits et al., Proc.
Natl. Acad. Sci. USA, 90: 2551-2555 (1993); Jakobovits et al., Nature, 362:255-258
(1993); Lonberg et al., U.S. Patent No. 5,569,825; Lonberg et al., U.S. Patent No.
5,545,806; Surani et al., U.S. Patent No. 5,545,807; and Kucherlapati, R. et al.,
European Patent No. EP 0 463 151 B1).

Another aspect of the invention is a method for directing an agent to cardiac
muscle. The differential expression of FATP6 in cardiac muscle but not in other
tissue types allows for the specific targeting of drugs, diagnostic agents, tagging
labels, histological stains or other substances specifically to cardiac muscle. A
targeting vehicle can be used for the delivery of such a substance. Targeting vehicles
which bind specifically to FATP6 can be linked to a substance to be delivered to the
cells of cardiac muscle. The linkage can be, for instance, via one or more covalent
bonds, or by high affinity non-covalent bonds. A targeting vehicle can be an
antibody, for instance, or other compound (e.g., a fatty acid or fatty acid analog)
which binds to FATP6 with high specificity.

Targeting vehicles specific to the heart-specific protein FATP6 have in vivo
(e.g., therapeutic and diagnostic) applications. For example, an antibody which
specifically binds to FATP6 can be conjugated to a drug to be targeted to the heart
(e.g., a cardiac glycoside to treat congestive heart failure, or β-adrenergic agents,
sodium channel blockers or calcium channel blockers to treat arrhythmias). A
substance (e.g., a radioactive substance) which can be detected (e.g., a label) in vivo
can also be linked to a targeting vehicle which specifically binds to a heart-specific
protein such as FATP6, and the conjugate can be used as a labeling agent to identify
cardiac muscle cells.

Targeting vehicles specific to FATP6 find further applications in vitro. For
example, an FATP6-specific targeting vehicle, such as an antibody (a polyclonal
preparation or monoclonal) which specifically binds to FATP6, can be linked to a
substance which can be used as a stain for a tissue sample (e.g., horseradish
peroxidase) to provide a method for the identification of cardiac muscle in a sample,
as can be used in embryology studies, for example.

In a similar manner, an agent can be directed to the liver of a mammal, as
FATP5 is expressed in liver but not in other tissue types. A targeting vehicle which
specifically binds to FATP5 can be conjugated to a drug for delivery of the drug to
the liver, such as a drug to treat hepatitis, Wilson's disease, lipid storage diseases and
liver cancer. As with targeting vehicles specific to FATP6, targeting vehicles specific
to FATP5 can be used in studying tissue samples in vitro.

The invention also relates to compositions comprising a modulator of FATP
function. The term "modulate" as used herein refers to the ability of a molecule to
alter the function of another molecule. Thus, modulate could mean, for example,
inhibit, antagonize, agonize, upregulate, downregulate, induce, or suppress. A
modulator has the capability of altering function of its target. Such alteration can be
accomplished at any stage of the transcription, translation, expression or function of
the protein, so that, for example, modulation of a target gene can be accomplished by
modulation of the DNA or RNA encoding the protein, or of the protein itself.

Antagonists or agonists (inhibitors or enhancers) of the FATPs of the
invention, antibodies that bind a FATP, or mimetics of a FATP can be employed in
combination with a non-sterile or sterile carrier or carriers for use with cells, tissues
or organisms, such as a pharmaceutical carrier suitable for administration to a
mammalian subject. Such compositions comprise, for instance, a media additive or a
therapeutically effective amount of an inhibitor or enhancer compound to be
identified by an assay of the invention and a pharmaceutically acceptable carrier or
excipient. Such carriers may include, but are not limited to, saline, buffered saline,
dextrose, water, ethanol, surfactants, such as glycerol, excipients such as lactose and
combinations thereof. The formulation can be chosen by one of ordinary skill in the
art to suit the mode of administration. The chosen route of administration will be
influenced by the predominant tissue or organ location of the FATP whose function is
to be inhibited or enhanced. For example, for affecting the function of FATP4, a
preferred administration can be oral or through a tube inserted into the stomach (e.g.,
direct stomach tube or nasopharyngeal tube), or through other means to accomplish
delivery to the small intestine. The invention further relates to diagnostic and
pharmaceutical packs and kits comprising one or more containers filled with one or
more of the ingredients of the aforementioned compositions of the invention.

Compounds of the invention which are FATPs, FATP fusion proteins, FATP
mimetics, FATP gene-specific antisense poly- or oligonucleotides, inhibitors or
enhancers of a FATP may be employed alone or in conjunction with other
compounds, such as therapeutic compounds. The pharmaceutical compositions may
be administered in any effective, convenient manner, including administration by
topical, oral, anal, vaginal, intravenous, intraperitoneal, intramuscular, subcutaneous,
intranasal, transdermal or intradermal routes, among others. In therapy or as a
prophylactic, the active agent may be administered to an individual as an injectable
composition, for example as a sterile aqueous dispersion, preferably isotonic.

Alternatively, the composition may be formulated for topical application, for
example, in the form of ointments, creams, lotions, eye ointments, eye drops, ear
drops, mouthwash, impregnated dressings and sutures and aerosols, and may contain
appropriate conventional additives, including, for example, preservatives, solvents to
assist drug penetration, and emollients in ointments and creams. Such topical
formulations may also contain compatible conventional carriers, for example cream
or ointment bases, and ethanol or oleyl alcohol for lotions.

In addition, the amount of the compound will vary depending on the size, age,
body weight, general health, sex, and diet of the host, and the time of administration,
the biological half-life of the compound, and the particular characteristics and
symptoms of the disorder to be treated. Adjustment and manipulation of established
dose ranges are well within the ability of those of skill in the art.

A further aspect of the invention is a method to identify a polymorphism, or
the presence of an alternative or variant allele of a gene in the genome of an organism
(of interest here, genes encoding FATPs). As used herein, polymorphism refers to
the occurrence of two or more genetically determined alternative sequences or alleles
in a population. A polymorphic locus may be as small as a base pair. Polymorphic
markers include restriction fragment length polymorphisms, variable number of
tandem repeats (VNTRs), hypervariable regions, minisatellites, dinucleotide repeats,
trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and insertion
elements such as Alu. The first identified allelic form, or the most frequently
occurring form can be arbitrarily designated as the reference (usually, "wildtype")
form, and other allelic forms are designated as alternative (sometimes, "mutant" or
"variant"). Diploid organisms may be homozygous or heterozygous for allelic forms.

An "allele" or "allelic sequence" is an alternative form of a gene which may
result from at least one mutation in the nucleotide sequence. Alleles may result in
altered mRNAs or polypeptides whose structure or function may or may not be
altered. Any given gene may have none, one, or many allelic forms (polymorphism).
Common mutational changes which give rise to alleles are generally ascribed to
natural deletions, additions, or substitutions of nucleotides. Each of these types of
changes may occur alone, or in combination with the others, one or more times in a
given sequence.

Several different types of polymorphisms have been reported. A restriction
fragment length polymorphism (RFLP) is a variation in DNA sequence that alters the
length of a restriction fragment (Botstein et al., Am. J. Hum. Genet. 32:314-331
(1980)). The restriction fragment length polymorphism may create or delete a
restriction site, thus changing the length of the restriction fragment. RFLPs have
been widely used in human and animal genetic analyses (see WO 90/13668; WO
90/11369; Donis-Keller, Cell 51:319-337 (1987); Lander et al., Genetics 121:85-99
(1989)). When a heritable trait can be linked to a particular RFLP, the presence of
the RFLP in an individual can be used to predict the likelihood that the individual
will also exhibit the trait.
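
The way a single base change that deletes a restriction site alters fragment lengths can be shown with a minimal Python sketch. The sequences below are invented for illustration and are not FATP sequences; the enzyme is assumed to be EcoRI, which recognizes GAATTC and cuts after the first G, and the function name is hypothetical.

```python
def fragment_lengths(sequence, site, cut_offset=0):
    """Lengths of the fragments produced by cutting `sequence` at every
    occurrence of the recognition `site`.

    cut_offset is the position within the site at which the cut falls
    (1 for an EcoRI-like cut of G^AATTC on the strand shown).
    """
    cuts = []
    start = 0
    while True:
        idx = sequence.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    boundaries = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


# Invented sequences: the variant carries one substitution (A to G)
# that destroys the second GAATTC site.
reference_allele = "AAAA" + "GAATTC" + "TTTTTT" + "GAATTC" + "CC"
variant_allele = "AAAA" + "GAATTC" + "TTTTTT" + "GAGTTC" + "CC"

reference_fragments = fragment_lengths(reference_allele, "GAATTC", cut_offset=1)
variant_fragments = fragment_lengths(variant_allele, "GAATTC", cut_offset=1)
```

In this toy case the reference allele yields three fragments and the variant only two, which is the kind of length difference that would be resolved by gel electrophoresis.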

Other polymorphisms take the form of short tandem repeats (STRs) that
include tandem di-, tri- and tetra-nucleotide repeated motifs. These tandem repeats
are also referred to as variable number tandem repeat (VNTR) polymorphisms.
VNTRs have been used in identity and paternity analysis (US 5,075,217; Armour et
al., FEBS Lett. 307:113-115 (1992); Horn et al., WO 91/14003; Jeffreys, EP
370,719), and in a large number of genetic mapping studies.
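
As a simple illustration of how alleles at a tandem repeat locus differ, the following Python sketch counts the longest uninterrupted run of a repeat motif in a sequence. The sequences, the CA motif and the function name are hypothetical examples chosen here, not data from the invention.

```python
def longest_tandem_repeat(sequence, motif):
    """Largest number of back-to-back copies of `motif` found in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    step = len(motif)
    for start in range(len(sequence) - step + 1):
        count = 0
        pos = start
        # Count consecutive copies of the motif beginning at this position.
        while sequence.startswith(motif, pos):
            count += 1
            pos += step
        best = max(best, count)
    return best


# Invented alleles differing only in the number of CA dinucleotide repeats
allele_a = "GGT" + "CA" * 5 + "TTG"
allele_b = "GGT" + "CA" * 8 + "TTG"
repeats_a = longest_tandem_repeat(allele_a, "CA")
repeats_b = longest_tandem_repeat(allele_b, "CA")
```

The two invented alleles carry five and eight repeats respectively; in practice such differences are usually read out as length differences of amplified fragments rather than by direct counting.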



Other polymorphisms take the form of single nucleotide variations between
individuals of the same species. Such polymorphisms are far more frequent than
RFLPs, STRs (short tandem repeats) and VNTRs (variable number tandem repeats).
Some single nucleotide polymorphisms occur in protein-coding sequences, in which
case, one of the polymorphic forms may give rise to the expression of a defective or
other variant protein and, potentially, a genetic disease. Other single nucleotide
polymorphisms occur in noncoding regions. Some of these polymorphisms may also
result in defective protein expression (e.g., as a result of defective splicing). Other
single nucleotide polymorphisms have no phenotypic effects.

Many of the methods described below require amplification of DNA from
target samples and purification of the amplified products. This can be accomplished
by PCR, for instance. See generally, PCR Technology, Principles and Applications
for DNA Amplification (ed. H.A. Erlich), Freeman Press, New York, NY, 1992; PCR
Protocols: A Guide to Methods and Applications (eds. Innis, et al.), Academic Press,
San Diego, CA, 1990; Mattila et al., Nucleic Acids Res. 19:4967 (1991); Eckert et al.,
PCR Methods and Applications 1:17 (1991); PCR (eds. McPherson et al., IRL Press,
Oxford); and US 4,683,202.

Other suitable amplification methods include the ligase chain reaction (LCR)
(see Wu and Wallace, Genomics 4:560 (1989); Landegren et al., Science 241:1077
(1988)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA
86:1173 (1989)), self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad.
Sci. USA 87:1874 (1990)), and nucleic acid based sequence amplification (NASBA).
The latter two amplification methods involve isothermal reactions based on
isothermal transcription, which produce both single stranded RNA (ssRNA) and
double stranded DNA (dsDNA) as the amplification products in a ratio of about 30 or
100 to 1, respectively.

Another aspect of the invention is a method for detecting a variant allele of a
human FATP gene, comprising preparing amplified, purified FATP DNA from a
reference human and amplified, purified FATP DNA from a "test" human to be
compared to the reference as having a variant allele, using the same or comparable
amplification procedures, and determining whether the reference DNA and test DNA
differ in DNA sequence in the FATP gene, whether in a coding or a noncoding
region, wherein, if the test DNA differs in sequence from the reference DNA, the test
DNA comprises a variant allele of a human FATP gene. The following is a
discussion of some of the methods by which it can be determined whether the
reference FATP DNA and test FATP DNA differ in sequence.
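
Once sequence is available for both samples, the determination of whether the test DNA differs from the reference DNA reduces to a position-by-position comparison. The Python sketch below assumes, for simplicity, that the two sequences are already aligned and of equal length (so only substitutions are detected); the function name and the short sequences are invented for illustration and are not FATP sequences.

```python
def sequence_differences(reference, test):
    """Return (position, reference_base, test_base) for each mismatch.

    Positions are 1-based. The sequences are assumed to be pre-aligned
    and of equal length; insertions and deletions are not handled.
    """
    if len(reference) != len(test):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(reference.upper(), test.upper())
    return [(i, r, t) for i, (r, t) in enumerate(pairs, start=1) if r != t]


# Invented sequences differing by a single C-to-T substitution
reference_dna = "ATGGCCTTAGC"
test_dna = "ATGGCTTTAGC"
differences = sequence_differences(reference_dna, test_dna)
has_variant_allele = bool(differences)
```

A non-empty result indicates that the test DNA carries a sequence difference relative to the reference over the compared region.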

Direct Sequencing. The direct analysis of the sequence of variant alleles of
the present invention can be accomplished using either the dideoxy chain termination
method or the Maxam and Gilbert method (see Sambrook et al., Molecular Cloning:
A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, New York 1989; Zyskind et
al., Recombinant DNA Laboratory Manual, Acad. Press, 1988).

Denaturing Gradient Gel Electrophoresis. Amplification products generated
using the polymerase chain reaction can be analyzed by the use of denaturing gradient
gel electrophoresis. Different alleles can be identified based on the different
sequence-dependent strand dissociation properties and electrophoretic migration of DNA in
solution (chapter 7 in Erlich, ed. PCR Technology, Principles and Applications for
DNA Amplification, W.H. Freeman and Co., New York, 1992).

Single-strand Conformation Polymorphism Analysis. Alleles of target
sequences can be differentiated using single-strand conformation polymorphism
analysis, which identifies base differences by alteration in electrophoretic migration
of single stranded PCR products, as described in Orita et al., Proc. Natl. Acad. Sci.
USA 86:2766-2770 (1989). Amplified PCR products can be generated as described
above, and heated or otherwise denatured, to form single-stranded amplification
products. Single-stranded nucleic acids may refold or form secondary structures
which are partially dependent on the base sequence. The different electrophoretic
mobilities of single-stranded amplification products can be related to base-sequence
differences between alleles of target sequences.

Detection of Binding by Protein That Binds to Mismatches. Amplified DNA
comprising the FATP gene or portion of the gene of interest from genomic DNA, for
example, of a normal individual is prepared, using primers designed on the basis of
the DNA sequences provided herein. Amplified DNA is also prepared, in a similar
manner, from genomic DNA of an individual to be tested for bearing a
distinguishable allele. The primers used in PCR carry different labels, for example,
primer 1 with biotin, and primer 2 with 32P. Unused primers are separated from the
PCR products, and the products are quantitated. The heteroduplexes are used in a
mismatch detection assay using immobilized mismatch binding protein (MutS) bound
to nitrocellulose. The presence of biotin-labeled DNA wherein mismatched regions
are bound to the nitrocellulose via MutS protein, is detected by visualizing the
binding of streptavidin to biotin. See WO 95/12689. MutS protein has also been
used in the detection of point mutations in a gel-mobility-shift assay (Lishanski, A. et
al., Proc. Natl. Acad. Sci. USA 91:2674-2678 (1994)).

Other methods, such as those described below, can be used to distinguish a 
FATP allele from a reference allele, once a particular allele has been characterized as 
to DNA sequence. 

Allele-specific probes. The design and use of allele-specific probes for
analyzing polymorphisms is described by, e.g., Saiki et al., Nature 324:163-166
(1986); Dattagupta, EP 235,726; Saiki, WO 89/11548. Allele-specific probes can be
designed so that they hybridize to a segment of a target DNA from one individual but
do not hybridize to the corresponding segment from another individual due to the
presence of different polymorphic forms in the respective segments from the two
individuals. Hybridization conditions should be sufficiently stringent that there is a
significant difference in hybridization intensity between alleles, and preferably an
essentially binary response, whereby a probe hybridizes to only one of the alleles.
Some probes are designed to hybridize to a segment of target DNA such that the
polymorphic site aligns with a central position (e.g., in a 15-mer at the 7 position; in a
16-mer, at either the 8 or 9 position) of the probe. This design of probe achieves
good discrimination in hybridization between different allelic forms.
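
The probe layout described above can be illustrated with a minimal sketch. The
function name, the example sequence and the choice of reading "the 7 position" as a
count starting from 1 are assumptions made for illustration only; sequences are
written in the sense of the target segment, and the hybridizing oligonucleotide
would be its reverse complement.

```python
def allele_specific_probes(reference, site, variant_base, length=15):
    """Return (reference_probe, variant_probe) covering one polymorphic site.

    `site` is the 0-based index of the polymorphic base in `reference`.
    The polymorphic base is placed at 1-based position length // 2 of the
    probe (7 in a 15-mer, 8 in a 16-mer), as quoted in the text.
    """
    index_in_probe = length // 2 - 1          # 0-based offset inside the probe
    start = site - index_in_probe
    end = start + length
    if start < 0 or end > len(reference):
        raise ValueError("polymorphic site is too close to an end of the sequence")
    ref_probe = reference[start:end]
    var_probe = (ref_probe[:index_in_probe] + variant_base
                 + ref_probe[index_in_probe + 1:])
    return ref_probe, var_probe


# Hypothetical target segment; the polymorphic base is reference[10] ("G").
reference = "ACGTACGTACGTACGTACGTACGT"
ref_probe, var_probe = allele_specific_probes(reference, site=10, variant_base="A")
print(ref_probe)
print(var_probe)
```

The two probes differ only at the central position, which is the pairing of a
reference-matched and a variant-matched probe that the passage describes.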

Allele-specific probes are often used in pairs, one member of a pair showing a
perfect match to a reference form of a target sequence and the other member showing
a perfect match to a variant form. Several pairs of probes can then be immobilized on
the same support for simultaneous analysis of multiple polymorphisms within the
same target sequence.

Allele-specific Primers. An allele-specific primer hybridizes to a site on
target DNA overlapping a polymorphism, and only primes amplification of an allelic
form to which the primer exhibits perfect complementarity. See Gibbs, Nucleic Acids
Res. 17:2427-2448 (1989). This primer is used in conjunction with a second primer
which hybridizes at a distal site. Amplification proceeds from the two primers,
resulting in a detectable product which indicates the particular allelic form is present.
A control is usually performed with a second pair of primers, one of which shows a
single base mismatch at the polymorphic site and the other of which exhibits perfect
complementarity to a distal site. The single-base mismatch prevents amplification
and no detectable product is formed. The method works best when the mismatch is
included in the 3'-most position of the oligonucleotide aligned with the polymorphism
because this position is most destabilizing to elongation from the primer (see, e.g.,
WO 93/22456).
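
As an illustration of the primer arrangement just described, the following sketch
builds one forward primer per allele so that the 3'-most base of each primer falls
on the polymorphic site. The function name, the primer length of 20 and the example
sequence are hypothetical and are not taken from the cited references.

```python
def allele_specific_primers(reference, site, alleles, length=20):
    """Map each allele base to a primer whose 3'-most base is that allele.

    `site` is the 0-based index of the polymorphic base on the strand shown.
    All primers share the same upstream bases and differ only at the 3' end.
    """
    start = site - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for a primer of this length")
    upstream = reference[start:site]
    return {base: upstream + base for base in alleles}


# Hypothetical 30-base target with a polymorphism at index 25.
reference = "GATTACAGATTACAGATTACAGATTACAGA"
primers = allele_specific_primers(reference, site=25, alleles="AG")
for allele, primer in primers.items():
    print(allele, primer)
```

Only the primer whose final base is complementary-matched to the template allele
would be expected to extend efficiently, which is the basis of the control reaction
with the mismatched primer pair.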

Gene Chips. Allelic variants can also be identified by hybridization to nucleic
acids immobilized on solid supports (gene chips), as described, for example, in WO
95/11995 and U.S. Patent No. 5,143,854, both of which are incorporated herein by
reference. WO 95/11995 describes subarrays that are optimized for detection of a
characterized variant allele. Such a subarray contains probes designed to be
complementary to a second reference sequence, which is an allelic variant of the first
reference sequence.

The present method is illustrated by the following examples, which are not 
intended to be limiting in any way. 

EXAMPLES
Materials and Methods
The following Materials and Methods were used in the work described in
Examples 1-5.

Sequence Alignment of FATP Clones. The DNA sequence for mouse FATP1
was obtained from the National Center for Biotechnology Information nonredundant
database. cDNAs for mmFATP2, 3, 4, and 5 were obtained by screening mouse
expression libraries (purchased from GIBCO/BRL, Rockville, MD) with probes
derived from the cloned expressed sequence tags (ESTs) (Research Genetics,
Huntsville, AL). Full-length clones were obtained for mmFATP2 and 5 and partial
sequences for mmFATP3 and 4. The sequences described herein have been
deposited in the GenBank database (Accession Nos. FATP2, AF072760; FATP3,
AF072759; FATP4, AF072758; FATP5, AF072757).

Neither FATP2 nor FATP5 contains an in-frame stop codon upstream of the
putative initiator methionine; initiator methionines were assigned by homology with
that in mmFATP1 and by the presence of a signal sequence immediately after it. The
Mycobacterium tuberculosis, Caenorhabditis elegans, and Saccharomyces cerevisiae
sequences were present in the dbEST database as part of the sequencing projects for
these organisms. Sequences were aligned utilizing a ClustalX algorithm and the
resulting alignment exported to SeqVu. Homologous amino acid substitutions are
boxed in Figure 1 and were determined using the Dayhoff 250 method with a 50%
homology cutoff.

Cell Transfection and LCFA Uptake. COS cells were cotransfected using the
DEAE-dextran method with the mammalian expression vector pCDNA 3.1
(Invitrogen, Carlsbad, CA) expressing the gene for CD2 (pCDNA-CD2) in
combination with either a pCDNA 3.1 or pCMVSPORT2 (GIBCO/BRL, Rockville,
MD) expression vector containing one of the murine or nematode FATP genes
(pCDNA-mmFATP1, pCDNA-FATP2, pCMVSPORT-FATP5, pCDNA-ceFATPb).
Two days after transfection, cells were assayed for CD2 expression with a
phycoerythrin-coupled anti-CD2 (PE-CD2) monoclonal antibody (PharMingen,
Franklin Lakes, NJ), and fatty acid uptake was assayed with a BODIPY-labeled fatty
acid analogue (Molecular Probes). Briefly, cells were washed twice with PBS
(phosphate buffered saline) and stained with PE-CD2 at 4°C for 30 min in PBS
containing 10% fetal calf serum. They were then washed three times with PBS/fetal
calf serum for 5 min followed by an incubation for 2 min at 37°C in fatty acid uptake
solution, which contained 0.1 µM BODIPY-FA and 0.1% fatty acid-free BSA
(bovine serum albumin) in PBS (Schaffer, J.E. & Lodish, H.F. (1994) Cell 79:427-
436). After 2 min, the cells were washed four times with ice-cold PBS/0.1% BSA.
The cells were then removed from the plates with PBS containing 5 mM EDTA and
resuspended in PBS containing 10% fetal calf serum and 10 mM EDTA. PE-CD2
and BODIPY-FA fluorescence were measured using a FACScan (Becton Dickinson,
Franklin Lakes, NJ). COS cells were gated on forward scatter (FSC) and side scatter
(SS). Cells exhibiting more than 300 CD2 fluorescence units (dsim) representing
15% of all cells were deemed CD2 positive and their BODIPY-FA fluorescence was
quantitated.
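
The gating step described above can be sketched as follows. This is an illustrative
calculation only: the event values are invented, the function names are
hypothetical, and the use of the mean as the summary statistic is an assumption,
since the text states only that BODIPY-FA fluorescence of the CD2-positive cells
was quantitated.

```python
from statistics import mean


def mean_uptake(events, cd2_gate=300.0):
    """Mean BODIPY-FA fluorescence of events above the CD2 gate.

    `events` is an iterable of (cd2_fluorescence, bodipy_fluorescence) pairs
    for cells that already passed the forward/side scatter gate.
    """
    positive = [bodipy for cd2, bodipy in events if cd2 > cd2_gate]
    if not positive:
        raise ValueError("no CD2-positive events above the gate")
    return mean(positive)


def fold_over_control(sample_events, control_events, cd2_gate=300.0):
    """Ratio of CD2-positive uptake in a sample to that in a CD2-only control."""
    return mean_uptake(sample_events, cd2_gate) / mean_uptake(control_events, cd2_gate)


# Invented example data: (CD2 units, BODIPY-FA units) per cell.
control = [(500, 10), (400, 10), (100, 5)]
sample = [(500, 150), (350, 150), (50, 4)]
print(fold_over_control(sample, control))
```

Restricting the comparison to CD2-positive events is what allows transfected cells
to be compared with one another regardless of transfection efficiency.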

E. coli-Based LCFA Uptake Assay. The full-length coding region of mtFATP
and a control protein, the mammalian transcription factor TFE3, were subcloned into
the inducible, prokaryotic expression vector pET (Novagen, Madison, WI).
Expression was induced with 1 mM isopropyl β-D-thiogalactoside (IPTG) for 1 hour,
or cells were left uninduced. Cells were washed in PBS/0.1% BSA and resuspended
in 1 ml PBS/0.1% BSA containing 0.1 µM [3H]palmitate (NEN) at 37°C. Uptake
was stopped after the indicated incubation time by transferring the cells onto filter
paper using a cell harvester (Brandel, Bethesda, MD). Filters were washed
extensively with ice-cold PBS/0.1% BSA, and [3H]palmitate was quantitated by
scintillation counting.

Northern Blots. Northern blot analysis of murine FATP expression was
done using poly(A) mRNA blots (Clontech, Palo Alto, CA). Probes of each of the
FATPs were derived from the 3' untranslated regions of each gene and were <60%
identical in sequence. Probes were labeled by random priming (Boehringer
Mannheim, Indianapolis, IN) and hybridized at 65°C. Blots were extensively washed
in 0.2% SSC/0.1% SDS at 65°C.

Generation of Phylogenetic Trees. Complete and partial sequences for FATP
genes from human, rat, mouse, puffer fish, Drosophila melanogaster, C. elegans, S.
cerevisiae, and M. tuberculosis were aligned using ClustalX. A homologous region
of 48 amino acids (residues 472-519 in mmFATP1) from all of the genes was used to
determine phylogenetic relationship within ClustalX. Based on these data a
phylogenetic tree was generated using TreeView PPC (Figure 5).

Nomenclature. It is proposed that the FATP genes be given a species-specific
prefix (mm, Mus musculus; hs, Homo sapiens; mt, M. tuberculosis; dm, D.
melanogaster; ce, C. elegans; sc, S. cerevisiae) and numbered such that mammalian
homologues in different species share the same number but differ in their prefix.
Since the two C. elegans genes cannot be paired with a specific human or mouse
FATP, they have been designated ceFATPa and ceFATPb.

Example 1: Identification of Novel Mammalian FATPs

The National Center for Biotechnology Information EST database was
screened, using the mouse FATP protein sequence (mmFATP1), to identify novel
FATPs. This strategy led to the identification of more than 50 murine EST sequences
which could be assembled into five distinct contiguous DNA sequences (contigs).
One contig was identical to the previously cloned FATP, which has been renamed
FATP1. Another, which has been renamed FATP2, is the murine homologue of a rat
gene previously identified by others as a very long chain acyl-CoA synthase
(Uchiyama, A., Aoyama, T., Kamijo, K., Uchida, Y., Kondo, N., Orii, T. &
Hashimoto, T. (1996) J. Biol. Chem. 271:30360-30365). The other three contigs
represented novel genes (FATP3, 4, and 5). Full-length clones for FATP2 and
FATP5 and nearly complete sequences for FATP3 and 4 (Figure 1) were obtained by
screening cDNA libraries made from mouse day 10.5 embryos and adult liver. Also
identified were human homologues for each of the murine genes in the EST database.
A sixth human gene was also identified; whether this gene is also present in the
mouse will require additional studies. Map positions are given in Tables 2 and 3.

The genetic loci for all of the human genes, with the exception of FATP5,
which was already mapped as an unknown EST, were determined using the radiation
hybrid panels. The map positions given below show the distance (in centiRays) from
the closest framework marker. As a guideline, there are approximately 300 kb/cR.

Table 2. Mapping Data for Human Genes

hsFATP1    Chromosome Chr19
           places 13.35 cR from WI-6344 (lod>3.0)
hsFATP2    Chromosome Chr15
           places 4.92 cR from D15S126 (lod>3.0)
hsFATP3    Chromosome Chr1
           places 13.24 cR from WI-2862 (lod>3.0)
hsFATP4    Chromosome Chr9
           places 7.80 cR from WI-9685 (lod>3.0)
hsFATP5    unknown EST previously mapped to near D19S418
hsFATP6    Chromosome Chr5
           places 1.41 cR from WI-4907 (lod>3.0)
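
As a worked illustration of the guideline of approximately 300 kb/cR, the following
sketch converts two of the radiation hybrid distances listed in Table 2 into
approximate physical distances. The function name is hypothetical, and the results
are rough estimates only, since the conversion factor is itself a guideline.

```python
KB_PER_CR = 300  # approximate guideline quoted in the text


def cr_to_kb(distance_cr, kb_per_cr=KB_PER_CR):
    """Convert a radiation hybrid distance in centiRays to approximate kilobases."""
    return distance_cr * kb_per_cr


# Distances to the closest framework marker, taken from Table 2.
distances_cr = {"hsFATP1": 13.35, "hsFATP6": 1.41}
for gene, cr in distances_cr.items():
    print(f"{gene}: {cr} cR is roughly {cr_to_kb(cr):.0f} kb from its marker")
```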

The mouse map is an internal backcross panel consisting of 188 mouse
backcross DNAs plus 4 controls (B6, Spretus, F1, Water). The backcross was
constructed by crossing B6 by Spretus animals and then crossing those F1's back to
B6. Mapping is accomplished by taking advantage of recombinational events during
meiosis, and the use of PCR primers to detect the differences (by size or re-annealing
events) at any given locus between the B6 and Spretus allele.

For the purposes of mapping, a novel set of primers (gene of interest)
is used to amplify from all 188 DNAs and then typed as being a B6 ("B") or a
Spretus ("S"). This string of B's and S's is entered into the Map Manager program,
which does a best fit calculation by comparing the string of 188 typings from the gene
of interest to all loci already extant in the panel, for all 20 chromosomes. The gene of
interest is then assigned to a particular area on a particular chromosome according to
a number of parameters, including the minimization of double cross-overs, and the
highest LOD scores. Indicated in Table 3 are distances to the closest markers on
either side of the FATP locus.

Table 3. Mapping Data for Mouse Genes

mmFATP1    Chromosome 8
           places 2.82 cM from D8Mit132 (lod 43.4) and 1.81 cM from D8Mit74
           (lod 43.5)
mmFATP2    Chromosome 2
           places 1.29 cM from D2Mit258 (lod 47.9) and 1.75 cM from D2NDS3
           (lod 44.9)
mmFATP3    Chromosome 3
           places 2.54 cM from D3Mit22 (lod 29.5) and 19.62 cM from D3Mit42
           (lod 13.6)
mmFATP4    Chromosome 2
           places 13.78 cM from D2Mit1 (lod 22.9) and 3.85 cM from D2Mit65
           (lod 41.9)
mmFATP5    Chromosome 7
           places 7.28 cM proximal of D7Mit21 (lod 28.3)
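
The comparison of B/S typing strings used for the mouse backcross panel can be
illustrated with a simplified sketch. This is not the Map Manager algorithm, which
also weighs double cross-overs and LOD scores; it only counts discordant typings
between two loci and treats the recombination fraction times 100 as an approximate
distance in cM, an approximation that holds for closely linked loci. All names and
typing strings below are invented.

```python
def recombination_fraction(typing_a, typing_b):
    """Fraction of animals typed differently ("B" vs "S") at two loci."""
    if len(typing_a) != len(typing_b):
        raise ValueError("typing strings must cover the same animals")
    pairs = [(a, b) for a, b in zip(typing_a, typing_b) if a in "BS" and b in "BS"]
    if not pairs:
        raise ValueError("no animals typed at both loci")
    return sum(a != b for a, b in pairs) / len(pairs)


def closest_locus(gene_typing, panel):
    """Return (locus_name, approximate_cM) for the best-matching panel locus.

    `panel` maps hypothetical locus names to their typing strings.
    """
    best = min(panel, key=lambda name: recombination_fraction(gene_typing, panel[name]))
    return best, 100.0 * recombination_fraction(gene_typing, panel[best])


# Invented 10-animal example; the real panel has 188 backcross DNAs.
gene = "BBSSBBSSBB"
panel = {"locusA": "BBSSBBSSBS", "locusB": "SSBBSSBBSS"}
print(closest_locus(gene, panel))
```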



Example 2: Assessment of Function 

The ability of the newly identified mouse genes to function as fatty acid
transporters was assessed using a fluorescence-activated cell sorting-based assay.
COS cells were transiently cotransfected with expression vectors encoding the cell
surface protein CD2 and either mmFATP1, mmFATP2, or mmFATP5, respectively.
Two days after transfection, COS cells were stained with an antibody to CD2 and
then incubated with a BODIPY-labeled fatty acid [BODIPY-FA, (Schaffer, J.E. &
Lodish, H.F. (1994) Cell 79:427-436)]. The cells were then washed extensively,
lifted off the dish, and analyzed by fluorescence-activated cell sorting. As judged by
the number of CD2-positive cells, the transfection efficiency was approximately
20-30%. Fatty acid uptake was quantitated in the transiently transfected COS cells by
measuring the BODIPY-FA fluorescence of the CD2-positive cells. Expression of
CD2 had no effect on fatty acid uptake as shown by the finding that COS cells
expressing only the transfected CD2 cDNA (CD2-positive) had the same low level of
BODIPY-FA uptake as did untransfected (CD2-negative) control cells (Figure 2A,
control). In COS cells cotransfected with CD2 and mmFATP1, mmFATP2, or
mmFATP5, uptake of BODIPY-FA by the transfected (CD2-positive) cells was
increased 15- to 90-fold over control (CD2 cDNA only) cells (Figures 2A-2D).

Example 3: Expression Patterns of Murine FATPs 

Expression patterns of members of the murine FATP gene family were
characterized by Northern blot analysis; to avoid cross-hybridization, the probes used
were from the 3' untranslated regions of these genes, which are less than 60% identical
in sequence. The expression pattern of FATP1 agrees with that previously found
(Schaffer, J.E. & Lodish, H.F. (1994) Cell 79:427-436). Here, expression was seen
primarily in heart and kidney. FATP2 is expressed almost exclusively in liver and
kidney, which corresponds to the reported tissue distribution of the rat homologue
[very long chain acyl-CoA synthase (VLACS)] as assessed by Western blotting
(Uchiyama, A., Aoyama, T., Kamijo, K., Uchida, Y., Kondo, N., Orii, T. &
Hashimoto, T. (1996) J. Biol. Chem. 271:30360-30365). FATP3 is present in lung,
liver, and testis. FATP5 is expressed only in liver and cannot be detected in other
tissues even when the blot is overexposed. The human homologue of FATP5 is also
liver specific and is not expressed in a wide array of other tissues tested, including
fetal liver.

Example 4: FATPs Are Evolutionarily Conserved 

The EST database was searched, using sequences conserved among the five
murine FATP genes, for FATP genes in other organisms. Two homologues were
found in C. elegans and one in M. tuberculosis. One of the C. elegans genes was
cloned from a cDNA library and expressed in COS cells, as described for the murine
FATPs. Overexpression of the nematode FATP resulted in a 15-fold increase of
BODIPY-FA uptake compared with control cells (Figure 3). The mycobacterial
FATP gene was isolated from a phage library and assessed for its ability to facilitate
fatty acid uptake. E. coli transformed with a prokaryotic, isopropyl β-D-
thiogalactoside-inducible expression vector containing the mycobacterial FATP gene
demonstrated a significant increase in the rate of [3H]palmitate uptake after induction,
compared with uninduced bacteria or E. coli transformed with a control protein
(Figure 4). Novel FATP genes were also identified in F. rubripes (puffer fish) and D.
melanogaster.

Example 5: Phylogenetic Tree of FATPs

Faergeman et al. (Faergeman, N.J., DiRusso, C.C., Elberger, A., Knudsen, J.
& Black, P.N. (1997) J. Biol. Chem. 272:8531-8538) identified three regions of very
strong conservation between the scFATP and mmFATP1 genes. The sequences of the
FATPs were compared over a 311-amino acid FATP "signature sequence" which
includes these conserved regions corresponding to amino acids 246-557 in
mmFATP1 (underlined in Figure 1). When compared with the National Center for
Biotechnology Information nonredundant database, only one region of the "FATP
signature sequence" shows significant homology to other proteins. This small stretch
of amino acids (underlined in Fig. 1) is an AMP-binding motif found in a multitude
of other proteins, such as acyl-CoA synthase, several CoA ligases, and gramicidin S
synthetase component II (Schaffer, J.E. & Lodish, H.F. (1994) Cell 79:427-436). The
relevance of this motif to fatty acid transport is unclear. Other highly conserved
regions among the FATPs, including long stretches of amino acids >90% identical
from mycobacteria to humans, are not found in any other class of proteins. A
48-amino acid segment of the FATP signature sequence was used to construct a
phylogenetic tree (Figure 5). Each of the human and mouse genes forms its own
branch; hsFATP6, which as yet has no murine homologue, is most closely related to
hsFATP3 and mmFATP3. As expected, rnVLACS is closer in sequence to
mmFATP2 than to hsFATP2. The FATP genes of invertebrates (i.e., C. elegans and
D. melanogaster) are most closely related to each other. Surprisingly, the
mycobacterial gene is more closely related to the human and mouse FATP5 genes than
to the FATPs of any of the lower organisms. Whether this reflects coevolution of the
mycobacterial and human genes awaits further study.

Materials and Methods

The following materials and methods were used in the work described in
Examples 6-10.



Isolation of full-length human FATP1 and 4
Full-length clones encoding human FATP1 and human FATP4 were
identified by searching databases for sequences similar to murine FATP1-5 coding
regions using the BLASTX algorithm (Altschul et al., J. Mol. Biol. 215:403-410, 1990).

A concatemer of nucleotide sequences comprising the coding sequences of
mmFATP1 (GenBank Accession U15976), mmFATP2, mmFATP3 (SEQ ID NO:6),
mmFATP4 (SEQ ID NO:8) and mmFATP5 (SEQ ID NO:10) was used to search the
Millennium database using the BLASTX algorithm. Sequences with a score >150
were evaluated for whether they represented known FATP coding sequences.

Human clones with similarity to the 5' end of murine FATP sequences were
sequenced completely. Clones encoding full-length human FATP1 were obtained
from a heart cDNA library constructed in the mammalian expression vector pMET7
(Tartaglia et al., Cell 83:1263-1271, 1995). Clones encoding full-length human
FATP4 were obtained from a spleen cDNA library constructed in the mammalian
expression vector pMET7.



Isolation of full-length human FATP6
Several clones encoding human FATP6 were identified by searching public
databases as described above. Five clones were analyzed further by restriction
digestion and DNA sequencing. One of these clones (GenBank Accession #
AA412064) appeared to be full-length and its entire insert was sequenced.



DNA Sequence Analysis
Sequences were aligned with the DNAStar program using the Clustal method.
Hydrophobicity plots were generated with DNA Strider using the Kyte-Doolittle
method.
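
For orientation, a Kyte-Doolittle hydrophobicity profile of the kind named above can
be sketched as a sliding-window average of per-residue hydropathy values. This is
not the DNA Strider implementation; the function name and the window of 19 residues
are assumptions (the window used in the work described here is not stated), and the
scale values are those of the published Kyte-Doolittle hydropathy index.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every window along a protein sequence.

    Raises KeyError for residues outside the 20 standard amino acids.
    """
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


# Invented peptide: a hydrophobic stretch flanked by charged residues.
peptide = "KKDEELLLIVVALLIVGAFLLKKDEE"
profile = hydropathy_profile(peptide, window=9)
print(max(profile), min(profile))
```

Stretches whose windowed average stays strongly positive are the candidates for
membrane-spanning segments in plots of this type.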
In situ hybridization

Tissues were collected from 8 week old C57/BL6 mice. Tissues were fresh
frozen, cut on a cryostat at 10 µm thickness and mounted on Superfrost Plus slides
(VWR). Sections were air dried for 20 minutes and then incubated with ice cold 4%
paraformaldehyde (PFA)/phosphate buffered saline (PBS) for 10 minutes. Slides
were washed twice for 5 minutes each with PBS, incubated with 0.25% acetic
anhydride/1 M triethanolamine for 10 minutes, washed with PBS for 5 minutes and
dehydrated with 70%, 80%, 95% and 100% ethanol for 1 minute each. Sections were
incubated with chloroform for 5 minutes. Hybridizations were performed with
35S-radiolabeled (5 x 10^7 cpm/ml) cRNA probes, generated from the 3' untranslated
regions of mouse FATPs by PCR followed by in vitro transcription, in the presence of
50% formamide, 10% dextran sulfate, 1x Denhardt's solution, 600 mM NaCl, 10 mM
DTT, 0.25% SDS and 10 µg/ml tRNA for 18 hours at 55°C. After hybridization,
slides were washed with 10 mM Tris-HCl pH 7.6, 500 mM NaCl, 1 mM EDTA (TNE)
for 10 minutes, incubated in 40 µg/ml RNase A in TNE at 37°C for 30 minutes,
washed in TNE for 10 minutes, incubated once in 2x SSC at 60°C for 1 hour, once in
0.2x SSC at 60°C for 1 hour, once in 0.2x SSC at 65°C for 1 hour and dehydrated
with 50%, 70%, 80%, 90% and 100% ethanol. Localization of mRNA transcripts was
detected by dipping slides in Kodak NTB-2 photoemulsion and exposing for 7 days at
4°C, followed by development with Kodak Dektol developer. Slides were
counterstained with haematoxylin and eosin and photographed. Controls for the in
situ hybridization experiments include the use of a sense probe, which showed no
signal above background in all cases.

Northern Blotting

Human mRNA blots were obtained from Invitrogen or Clontech. PCR
fragments from the 3' untranslated regions of human FATPs were used as probes.
Blots were probed with 32P-labeled DNA probes using the Rapid-Hyb buffer
(Amersham, Buckinghamshire, UK) according to the manufacturer's instructions.

Cell transfection and LCFA uptake. COS cells were cotransfected, using
lipofectamine (GIBCO BRL, Rockville, MD) according to the manufacturer's
instructions, with the mammalian expression vector pCDNA3.1 (Invitrogen,
Carlsbad, CA) expressing the gene for CD2 in combination with a pMET7 expression
vector (Tartaglia et al., Cell 83:1263-1271, 1995) containing hsFATP1 (pMET7-
hsFATP1) or hsFATP4 (pMET7-hsFATP4) or pMET7 alone. Two days after
transfection, cells were assayed for CD2 expression with a phycoerythrin-coupled
anti-CD2 (PE-CD2) monoclonal antibody (PharMingen, Franklin Lakes, NJ), and
fatty acid uptake was assayed with a BODIPY-labeled fatty acid analog (Molecular
Probes) as described above.

Example 6: Determination of Expression of mmFATPs 

mmFATP4, and to a lesser extent mmFATP2, are expressed at high levels in
the brush border layer of the small intestine

Absorption of dietary fat requires transport of free fatty acids across the apical
membrane of epithelial cells in the small intestine. Previous studies suggested that
this transport is protein-mediated; however, the transport protein had not yet been
identified. In situ hybridization was performed on each of the three regions of the
small intestine (duodenum, jejunum and ileum) as well as the colon, using probes
from the 3' untranslated regions of mmFATP1, mmFATP2, mmFATP3, mmFATP4
and mmFATP5, to determine whether any of the mouse FATPs are expressed in the
small intestine. It was expected that a protein involved in fatty acid absorption would
be expressed in the epithelial cells of the small intestine, but absent from the colon.
Expression of mmFATPs in the jejunum was identical to that in the ileum in
all cases. High levels of mmFATP4 mRNA were present in the epithelial cells of the
jejunum and ileum, and lower, but significant, amounts were detected in the epithelial
cells of the duodenum. Significantly, FATP4 mRNA was absent from other cell
types of the small intestine and no FATP4 mRNA could be detected in any of the
cells of the colon. FATP2 mRNA was present in the epithelial cells of the duodenum
at a level similar to that of FATP4, but was present at lower levels in the jejunum and
ileum. No signals above background were detected for mmFATP1, mmFATP3 and
mmFATP5 in any of the intestinal tissues. mmFATP3 and FATP5 were clearly
detectable by in situ hybridization in adult liver and mmFATP1 could be detected in a
variety of tissues on a whole embryo in situ, indicating that the FATP1, 3, and 5
probes were working.

mmFATP4 expression is predominant in the small intestine compared to the
other organs of the mouse embryo. In the small intestine, FATP4 expression is
limited to differentiated enterocytes, while no signal is detected in the connective
tissue or the undifferentiated epithelial cells in the crypts. Differentiated enterocytes
are known to be the cells that mediate the uptake of fatty acids. FATP4 is specifically
and strongly expressed in the epithelial cells of adult murine duodenum and ileum but
not colon. Other FATPs, such as FATP5, are not expressed in the small intestine.
Thus, FATP4 is the major FATP in the mouse small intestine. Given its high level of
expression, it is likely that FATP4, and to a lesser extent FATP2, play an important
role in the absorption of fatty acids.

mmFATP2 and mmFATP5 are expressed in hepatocytes

Northern analysis of mmFATP2, mmFATP3, mmFATP4 and mmFATP5
showed expression in the liver. To determine whether these proteins are present in
hepatocytes or other cell types present in liver homogenates, in situ hybridizations
were performed. mmFATP2 and mmFATP5 mRNA was clearly present in
hepatocytes, and was not concentrated in other cell types such as endothelial cells or
macrophages. No signal above background was detected for mmFATP1 in any of the
cell types in the liver, consistent with the results of the Northern blotting.

Example 7: Isolation and Sequence Analysis of Full-length Human FATP1 and Full-
length Human FATP4

To identify human cDNA clones encoding FATP family members,
Millennium databases were searched for sequences similar to murine FATP1-5
coding regions. Two clones were analyzed in detail; inspection of the entire DNA
sequence of these two clones showed that they encode the human orthologs of
mmFATP1 and mmFATP4, respectively. These two clones were designated
hsFATP1 and hsFATP4, and their DNA and predicted protein sequences are shown
in Figures 44A-44C and 45, and 50A-50C and 51. hsFATP1 is predicted to encode a
646 amino acid, 71 kD protein with multiple membrane-spanning domains (Figure
28A). hsFATP4 is predicted to encode a 643 amino acid, 72 kD protein with
multiple membrane-spanning domains (see Figure 29A). A comparison of the DNA
sequences of mouse and human FATP1 and mouse and human FATP4 (Figures 30A-
30B and 31A-31B) shows that the mouse and human orthologs are 85% (FATP1) and
87% (FATP4) identical to each other within the coding sequences given in these
figures. At the amino acid level, hsFATP1 and hsFATP4 are ~90% identical to their
respective mouse orthologs within the coding region shown in these figures (Figures
32 and 33). The sequence identities between mouse and human FATP1 and FATP4
are considerably higher than the ones observed between different FATP family
members within one species (~40%-60%) and are present in the N-terminal part of
the protein, a region that is poorly conserved between different FATP family
members. This high degree of sequence conservation clearly demonstrates that the
newly identified human FATPs are orthologs of mouse FATP1 and FATP4 rather
than novel FATP family members.

Table 4 is an identity/similarity matrix comparing the amino acid sequences
of FATP1 and 4 from human and mouse. This shows that the gene whose sequence
is shown in Figure 43A is indeed human FATP4, since it is 91% identical with the
murine FATP4 but only 62% identical with the closest related human FATP, which is
FATP1.

Table 4

Identity/Similarity Matrix

             hsFATP4    mmFATP4    hsFATP1    mmFATP1
hsFATP4         -         93.2       72.3       72.0
mmFATP4       91.0         -         71.2       71.1
hsFATP1       61.9       61.0         -         92.4
mmFATP1       60.7       59.6       89.5         -
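
A percent identity value of the kind tabulated in an identity matrix can be computed
from a pairwise alignment as sketched below. This is an illustrative calculation
only: the function name is hypothetical, the short sequences are invented, and the
convention of ignoring gapped columns is an assumption, since alignment programs
differ in how they choose the denominator.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented aligned fragments, not FATP sequences.
print(round(percent_identity("MKT-LLA", "MKTALIA"), 1))
```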





Example 8: Isolation and Sequence Analysis of Full-length Human FATP6 

A search of EST databases identified a set of overlapping human sequences
that were similar to FATPs, but did not have a clear mouse ortholog. One of these
EST clones was found to encode a full-length cDNA. The entire insert of this clone
was sequenced and designated hsFATP6. The DNA and predicted protein sequences
of hsFATP6 are shown in Figures 54A-54C and 55. hsFATP6 is predicted to encode
a 619 amino acid, 70 kD protein with multiple membrane-spanning domains (Figure
35A). A comparison of the amino acid sequences of hsFATP6 with other human
FATPs shows about 37% identity to either hsFATP1 or hsFATP4 (Figure 36). This
degree of sequence identity is similar to what is observed between different mouse
FATPs. The phylogenetic analysis described above clearly demonstrates that
hsFATP6 is a member of the FATP family, but not an ortholog of any of the mouse
FATPs. Comparisons were done with "ALIGN" (E. Myers and W. Miller, "Optimal
Alignments in Linear Space," CABIOS 4:11-17 (1988)) using standard settings.

Example 9: Tissue Distribution of Human FATPs 

The tissue distribution of human FATPs was assessed by Northern blotting.
Human FATP3 was expressed in a large variety of tissues. In contrast, human
FATP5 was present at high levels in the liver, but was undetectable in all other
tissues examined. Thus, both hsFATP3 and hsFATP5 recapitulate the expression
pattern of their mouse orthologs (see above). hsFATP6 is a novel FATP with no
mouse ortholog as yet. Northern blotting shows that hsFATP6 is expressed at high
levels in the heart, but is undetectable in other tissues, including skeletal and smooth
muscle. This tissue distribution suggests that human FATP6 performs an important
role in energy metabolism in the heart; blocking FATP6-mediated fatty acid transport
may therefore be beneficial for a number of heart diseases, e.g., ischemic heart
disease.

To identify the major FATP expressed in the human small intestine, Northern
blotting was performed on a blot containing mRNA from human stomach, jejunum,
ileum, colon, rectum and lung. hsFATP5 and hsFATP6 were undetectable in all of
these tissues; FATP5 is only expressed in liver and FATP6 only in heart. hsFATP2
was weakly expressed in the colon, and an even weaker signal was detectable in
the jejunum, ileum and lung lanes. hsFATP3 was expressed well in the lung, but was
only weakly expressed in the other tissues tested. Importantly, no difference was seen
in the expression of hsFATP3 between small intestine and stomach or colon,
suggesting that the expression observed is not related to fatty acid absorption in the
small intestine. hsFATP4 was clearly expressed in both jejunum and ileum;
expression was significantly lower in the colon and was absent in the stomach. This
expression pattern is consistent with a major role for FATP4 in absorption of fatty
acids in the human gut.



Example 10: Expression of hsFATP1 and hsFATP4 Promotes Transport of Fatty
Acids

COS cells were cotransfected using lipofectamine with the mammalian
expression vector pCDNA-CD2 in combination with one of the FATP-containing
expression vectors (pMET7-hsFATP1 or pMET7-hsFATP4) or an insertless
expression vector (pMET7, control) as described in Materials and Methods for
Examples 6-10. COS cells were gated on forward scatter and side scatter. Cells
exhibiting more than 400 CD2 fluorescence units, representing ~30% of all cells, were
deemed CD2-positive. The percent of CD2-positive cells exhibiting a BODIPY
fluorescence of >300 is plotted for the three different vectors tested (Figure 37).
Example 11: Stable Expression of Human FATP4 in 293 Cells

Stable cell lines were generated as follows. A DNA fragment containing the
entire hsFATP4 coding sequence as well as 100 nucleotides of 5' and 50 nucleotides
of 3' untranslated region was inserted into the vector pIRES-neo (Clontech, Palo Alto,
CA) using standard cloning techniques. The resulting construct or a vector control
(pIRES-neo) was transfected into 293 cells using the lipofectamine method (Gibco
BRL, Rockville, MD) according to the manufacturer's directions. Cells that had
taken up the DNA were selected with 1 mg/ml G418 (Gibco BRL, Rockville, MD).
Single colonies were picked 1 to 2 weeks after transfection and grown in medium
containing 0.8 mg/ml G418. Colonies were screened for the ability to take up fatty
acids by measuring uptake of a fluorescently labeled fatty acid (BODIPY-FA). About
40 colonies transfected with pIRES-neo containing FATP4 and ~20 colonies
transfected with the pIRES-neo control were analyzed. All 20 of the vector control
clones showed amounts of BODIPY-FA uptake similar to each other and to
untransfected 293 cells. In contrast, among the 40 FATP4 transfected clones, 3 had a
5- to 10-fold increased BODIPY-FA uptake compared to any of the vector controls,
and a large number (~20) showed an approximately two-fold increase in
BODIPY-FA levels. This distribution is consistent with FATP4 conferring increased
fatty acid uptake in these cells. One of the cell lines with the highest amount of
BODIPY-FA uptake was selected for measuring uptake of tritiated fatty acid.

The uptake of tritiated oleate by either FATP4 expressing or control
cells was assayed over time. Expression of FATP4 increases the rate of fatty acid
uptake by over 3-fold, demonstrating that FATP4 is, like the other FATPs, a
functional fatty acid transporter (Figure 38).

Example 12: Immuno-staining with FATP4-Specific Antiserum

A polyclonal antiserum against the C-terminus of mmFATP4 was raised using
a GST-fusion protein having mmFATP4-specific amino acid sequence 552-643
(AVASP...GEEKL). In western blot experiments, the purified antibody reacted
strongly with a synthetic peptide matching the C-terminus of mmFATP4, but not with
a corresponding region of mmFATP2, mmFATP3, or mmFATP5. In western blot
experiments with enterocyte lysates from 3 different mice, the mmFATP4-specific
polyclonal antiserum detects a ~70 kDa protein, which is in accordance with
mmFATP4's predicted molecular weight of 72 kDa. The binding is specific for
mmFATP4, since it can be completely abolished by preincubation of the antiserum
with the GST-fusion peptide used to raise the antibody.

Immunofluorescence experiments were performed using the anti-mmFATP4
antiserum on fresh frozen sections of murine small intestine. The antibody binding
demonstrates strong expression of mmFATP4 in enterocytes, confirming the results
of the in situ hybridization experiments. At higher magnifications it is apparent that
mmFATP4 is expressed at the apical side of the enterocyte, indicating that the
transporter is present in the brush border membrane, which is known to mediate the
uptake of fatty acids from the intestinal lumen.

Immuno-electron microscopy studies were performed on fresh frozen murine
intestinal cells. The gold particles used, appearing as black specks on the electron
micrographs, indicate the subcellular localization of mmFATP4 to be on the
microvilli of the enterocyte. It can be seen from electron micrographs that
mmFATP4 is localized exclusively in membranes, preferentially the apical plasma
membrane, confirming that it is indeed a membrane protein.

Methods for Immunofluorescence and Immunogold Electron Microscopy

Unfixed mouse small intestine was washed with Hank's buffered salt solution
containing 1 mM EDTA, infused with 2.3 M sucrose solution, and embedded in
O.C.T. 4583 compound. The material was thick sectioned (15 µm - 40 µm). The
sections were washed in PBS containing 1% BSA and 0.075% glycine to block
non-specific binding. Primary and secondary antibodies were diluted in PBS with 10%
FCS and incubated for 1 h. The sections were mounted in 90% glycerol/PBS
containing 1 mg/ml paraphenylenediamine, and examined with a Bio-Rad MRC 600
confocal, mounted on a Zeiss Axioscop.

For the immunogold labeling, the tissue was fixed with 2% paraformaldehyde
in PBS for 10 minutes, after which it was cryoprotected by infiltration with 2.3 M
sucrose in 0.1 M phosphate buffer (pH 7.4) containing 20% polyvinylpyrrolidone,
and then mounted on aluminum cryo nails and frozen in liquid nitrogen (Tokuyasu,
K.T., J. Microscop. 143:139-149, 1986). Ultrathin sections were collected on
carbon/formvar-coated nickel grids. The primary antibody (anti-FATP4) was diluted
in 10% FCS in PBS and incubated overnight at 4°C, followed by donkey anti-rabbit
IgG-gold (12 nm) (Jackson Labs) for 1 h. The sections were stained in 2% neutral
uranyl acetate (20 minutes) and absorption stained with 2% uranyl acetate in 0.2%
methylcellulose containing 3.2% polyvinyl alcohol. The sections were examined
with a Philips EM 410 electron microscope.

Example 13: Inhibition of Fatty Acid Uptake Specific to FATP4 Demonstrated in
Isolated Mouse Enterocytes

Phosphorothioate derivatives of the following oligonucleotides were
synthesized:

FATP4-AS2      CCCCCACCAGAGAGGCTCC    (SEQ ID NO: 103)
FATP4-AS2MM    CCACCCCCGGAAAGCCTGC    (SEQ ID NO: 104)
FATP4-S2       GGAGCCTCTCTGGTGGGGG    (SEQ ID NO: 105)

FATP4-AS2 is the antisense oligo; it is designed to be complementary to the
sequence extending from nucleotide 10 to nucleotide 28 of the mouse FATP4 coding
sequence. FATP4-AS2MM is a control oligo in which every third nucleotide was
changed to create mismatches; the overall nucleotide composition is identical to that
of FATP4-AS2 (same number of G, A, T, C). FATP4-S2 is the sense control.
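The design properties stated for these three oligonucleotides can be checked directly from the listed sequences. The following is a minimal sketch in Python; the helper names (`reverse_complement`, `mismatch_positions`) are illustrative and not part of the original disclosure.

```python
from collections import Counter

# Oligonucleotide sequences as listed above, written 5' to 3'.
FATP4_AS2 = "CCCCCACCAGAGAGGCTCC"    # antisense oligo
FATP4_AS2MM = "CCACCCCCGGAAAGCCTGC"  # mismatch control
FATP4_S2 = "GGAGCCTCTCTGGTGGGGG"     # sense control

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mismatch_positions(a: str, b: str) -> list[int]:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


# The antisense oligo should be the reverse complement of the sense oligo.
antisense_matches_sense = reverse_complement(FATP4_S2) == FATP4_AS2

# The mismatch control should have the same base composition as the antisense oligo.
same_composition = Counter(FATP4_AS2) == Counter(FATP4_AS2MM)

# Positions at which the mismatch control differs from the antisense oligo.
mismatches = mismatch_positions(FATP4_AS2, FATP4_AS2MM)
```

With the sequences as listed, the mismatches fall at every third position of the 19-mer, which agrees with the description of the control oligo.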

Enterocytes were isolated from the small intestine of mice and incubated for
48 h in tissue culture (Figure 40) either without oligonucleotides (squares) or with
100 µM FATP4 specific sense (circles) or antisense (diamonds) oligonucleotides. The
uptake over time of 25 µM oleate was then measured. While the FATP4 sense
oligonucleotide did not significantly influence the uptake, the antisense
oligonucleotide inhibited fatty acid uptake by ~50%.

The effect of FATP4 sense, antisense or mismatch sequence
oligonucleotides on the uptake of fatty acids was measured in enterocytes. Isolated
enterocytes were incubated with increasing concentrations of FATP4 antisense
oligonucleotides (solid bars in Figure 41), or a mismatch control oligonucleotide with
identical nucleotide composition (stippled bars), or with 100 µM of the FATP4
sense oligonucleotide (lined bar). The medium for this incubation was Dulbecco's
modified Eagle's medium with 4.5 g/L glucose, 1 mM sodium pyruvate, 0.01 mg/ml
human transferrin and 10% fetal bovine serum. After 48 hours of incubation the
uptake of oleate by enterocytes was measured over a 5 minute time interval.
Measurements were done in quadruplicate. The uptake assay was done in Hank's
buffered salt solution with 10 mM taurocholate. Only the enterocytes given FATP4
antisense oligonucleotide showed a concentration-dependent decrease in fatty acid
uptake, which was inhibited by ~50% at a concentration of 100 µM. This effect was
FATP4 specific: only the antisense oligonucleotide, which can bind to the FATP4
mRNA and block its translation, inhibited uptake, whereas a control oligonucleotide
differing only in sequence but not in nucleotide content did not. This rules out a toxic
or otherwise nonspecific inhibitory effect of the oligonucleotide due to its chemical
composition.

As a further control experiment, the uptake of oleate was measured along with
the uptake of methionine in the same cultured enterocytes. Antisense
oligonucleotide, mismatch sequence oligonucleotide, or no oligonucleotide was
added to a concentration of 100 µM to cultures of enterocytes. After incubation for
48 hours, the uptake of both 3H-labeled oleate and 35S-labeled methionine was
assayed. Results are shown in Figure 42. Fatty acid uptake is at the left side of the
paired bars; methionine uptake is on the right side of the paired bars. The fact that
amino acid uptake was not influenced by the antisense oligonucleotide treatment
further supports the conclusion that the antisense oligonucleotide causes a specific
reduction in translation of FATP4-specific mRNA.

Example 14: mmFATP2 Is Expressed in Proximal Renal Tubule Epithelium

Northern analysis showed that mmFATP1, mmFATP2, and mmFATP4 are
present in the kidney. In situ hybridization (methods as for Example 6) was
performed to determine in which cell type(s) of the kidney these mRNAs are
expressed. mmFATP1 mRNA was present in virtually all cells throughout the kidney
with no obvious preference for a particular cell type. In contrast, mmFATP2 was
expressed only in the renal cortex. Within the cortex, expression of mmFATP2 was
restricted to the epithelial cells of the proximal renal tubules. The primary function
of proximal renal tubule cells is the reabsorption of filtered salts and nutrients (e.g.,
glucose), a process that requires mitochondrial oxidation and that can utilize fatty
acids as energy substrates. Based on the localization of mmFATP2, it is possible that
mmFATP2 is important for reabsorption in the kidney by allowing uptake of an
energy source (fatty acids) from the blood into renal epithelial cells. Alternatively, if
fatty acids need to be reabsorbed in the kidney, similarly to glucose, FATP2 could be
involved in the reabsorption of fatty acids. Determination of the subcellular
localization of FATP2 will distinguish between these two possibilities.
obtained from a human bone library constructed in the mammalian expression vector
pMET7 (Tartaglia, L.A. et al., Cell 83:1263-1271, 1995). To identify human cDNA
clones encoding FATP family members, databases were searched for sequences
similar to murine FATP1-5 coding regions. One clone was found to encode the
human ortholog of mmFATP3 and was designated hsFATP3. The DNA and
predicted protein sequences of hsFATP3 are shown in Figures 94A and 94B.
hsFATP3 is predicted to encode a 702 amino acid, 75.6 kD protein with multiple
membrane-spanning domains. A comparison of the DNA sequences of mouse and
human FATP3 shows that the mouse and human orthologs are 81% identical to each
other within the coding region. At the amino acid level, hsFATP3 is ~86% identical
to mmFATP3 within the coding region. The sequence identities between mouse and
human FATP3 are considerably higher than those observed between different FATP
family members within one species (~40%) and are also present in the N-terminal
part of the protein, a region that is poorly conserved between different FATP family
members.

Example 16: Substrate Specificity of Fatty Acid Transport in hsFATP-Transfected
Clones

Using a mammalian expression vector, we generated 40 stable 293 cell lines
expressing hsFATP4 and 20 cell lines transfected with a control plasmid. The ability
of the different cell lines to take up FA, as assessed by uptake assays using the
fluorescently labeled Bodipy-palmitate, correlated well with their FATP4 expression
levels determined by Western blotting (FIG. 95). All 20 vector control clones showed
amounts of Bodipy-FA uptake similar to each other and to untransfected 293 cells. In
contrast, among the 40 FATP4 transfected clones, a large number (~20) showed an
approximately 2-fold increase in Bodipy-FA uptake compared to any of the vector
controls, and three had a 5- to 10-fold increase in Bodipy-FA uptake.

Several of the cell lines with the highest amount of Bodipy-FA uptake as well
as isolated primary enterocytes were used to measure the uptake of radiolabeled FAs.
Short-term uptake by 293 cells and enterocytes of all FAs tested was linear (FIG. 97).
hsFATP4 expression enhanced the rate of palmitate uptake approximately 3-fold over
293 cells transfected with vector alone (FIG. 97) and also accelerated the uptake of
oleate but not of linolate, arachidonate, octanoate, butyrate or cholesterol (Table 6).
Isolated primary enterocytes showed a similar preference for palmitate and oleate,
and absence of transport of arachidonate, octanoate, and butyrate, but displayed a
more robust transport of linolate and cholesterol than the transfected 293 cells.

To further characterize the substrate specificity of FATP4, we measured the
uptake by stably transfected 293 cells of 5 µM Bodipy-FA in the presence of a
20-fold molar excess (i.e., 100 µM) of FAs, FA-derivatives and lipid soluble vitamins
and hormones. Both saturated and non-saturated fatty acids containing 10 to 26 C
atoms strongly competed for uptake of Bodipy-palmitate (FIG. 96 and Table 7) and
thus are presumed to be substrates of FATP4. In contrast, fatty acids with eight or
fewer C atoms did not compete and thus are presumed not to be FATP4 substrates.
Similarly, esters of long chain FAs and other hydrophobic molecules tested had no
effect on uptake of Bodipy-palmitate.

LCFA Uptake Assays (Methods)

Bodipy-FA uptake assays using FACS were performed, adapted to a 96-well
format. LCFA uptake assays with enterocytes or with stably transfected 293 cells
were done as follows. Mixed micelles of radiolabeled FA (NEN) and taurocholate
(Sigma) in HBS were generated by brief sonication at 37°C. Equal volumes of cells
and micelle solution were mixed, resulting in a final FA concentration of 25 µM for
antisense assays and 10 µM for substrate specificity assays. Final taurocholate
concentration was 5 mM. Cells were incubated for the indicated amount of time at
37°C. The assay was stopped by transferring the cells onto filter paper followed by
extensive washes with ice-cold HBS containing 0.1% BSA using a cell harvester
(Brandell). Incorporated oleate was then determined by β-scintillation counting
(Beckman).
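Because equal volumes of cells and micelle solution are mixed, the micelle solution must be prepared at twice the stated final concentrations. The short Python sketch below works through that arithmetic; it assumes that the cell suspension itself contributes no fatty acid or taurocholate, which the text does not state explicitly, and the function name is illustrative.

```python
def stock_concentration(final_conc: float, stock_volume: float, other_volume: float) -> float:
    """Concentration required in the stock so that mixing stock_volume of stock
    with other_volume of a solution lacking the solute yields final_conc."""
    total_volume = stock_volume + other_volume
    return final_conc * total_volume / stock_volume


# Equal volumes of micelle solution and cells (1:1 mixing).
fa_stock_antisense_uM = stock_concentration(25.0, 1.0, 1.0)    # antisense assays
fa_stock_specificity_uM = stock_concentration(10.0, 1.0, 1.0)  # substrate specificity assays
taurocholate_stock_mM = stock_concentration(5.0, 1.0, 1.0)     # final 5 mM taurocholate
```

Under these assumptions the micelle solution would contain 50 µM or 20 µM fatty acid, respectively, and 10 mM taurocholate before mixing.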



Table 6

Uptake of Different Substrates by FATP4 Expressing Cell Lines and
Enterocytes

                  293 Cells    293 Cells Stably     FATP4
Fatty Acid        Control*     Expressing FATP4     specific    Enterocytes*
Palmitate           564            1695               1131         3036
Oleate              662            1122                459          117
Linolate            640             673                 33          116
Arachidonate          3               5                  2            0
Octanoate             0               0                  0            5
Butyrate              0              50                 50           73
Cholesterol         319             345                 26          531

Uptake of different substrates by enterocytes and by control
and stable FATP4-expressing 293 cells. The rates of uptake
for the indicated fatty acids were measured over 4 min, taking
measurements every 30 s. All fatty acids were at a
concentration of 10 µM in HBS containing 5 mM taurocholate.

*Uptake measured as pmol/min/10^6 cells
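The "FATP4 specific" column of Table 6 appears to be the difference between the uptake by FATP4-expressing cells and the uptake by control cells. The Python sketch below recomputes that column and the fold increase over control from the tabulated values; the variable and function names are illustrative, and treating the column as a simple difference is an interpretation of the table rather than a stated method.

```python
# Uptake rates from Table 6 in pmol/min/10^6 cells:
# (control 293 cells, 293 cells stably expressing FATP4)
uptake = {
    "Palmitate": (564, 1695),
    "Oleate": (662, 1122),
    "Linolate": (640, 673),
    "Arachidonate": (3, 5),
    "Octanoate": (0, 0),
    "Butyrate": (0, 50),
    "Cholesterol": (319, 345),
}


def fatp4_specific(control: float, expressing: float) -> float:
    """Uptake attributable to FATP4: expressing cells minus control cells."""
    return expressing - control


def fold_increase(control: float, expressing: float):
    """Fold increase over control, or None when control uptake is zero."""
    if control == 0:
        return None
    return expressing / control


specific = {name: fatp4_specific(c, e) for name, (c, e) in uptake.items()}
fold = {name: fold_increase(c, e) for name, (c, e) in uptake.items()}
```

For palmitate the ratio is about 3.0, in line with the approximately 3-fold enhancement described in the text. For oleate the subtraction gives 460 rather than the tabulated 459, presumably because the table entries were rounded.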



Table 7

Competition of Bodipy-FA Uptake by FATP4 Expressing Cells

Fatty Acids                       Formula             Competition
Butyric Acid                      C4H8O2
Caproic Acid                      C6H12O2             -
Caprylic Acid                     C8H16O2             -
Capric Acid                       C10H20O2            ++
Lauric Acid                       C12H24O2            ++
Myristic Acid                     C14H28O2            ++
Palmitic Acid                     C16H32O2
Stearic Acid                      C18H36O2            +
Oleic Acid                                            ++
Linoleic Acid                     C18H32O2            ++
Arachidic Acid                    C20H40O2            ++
Lignoceric Acid                   C24H48O2            ++
Cerotic Acid                      C26H52O2            ++

Fatty Acid Derivatives

Fatty Acids                       Formula             Competition
Palmitic Acid Methyl Ester        C17H34O2
Stearic Acid Methyl Ester         C19H38O2
Oleic Acid Ethyl Ester            C20H38O2
Oleic Acid Oleyl Ester            C36H68O2
Oleoyl CoA                        C39H68N7O17P3S
Cholesteryl Oleate                C45H78O2

Lipid-Soluble Vitamins & Hormones

Fatty Acids                       Formula             Competition
[illegible] (Vitamin A)
[illegible]                       [illegible]
Tocopherol (Vitamin E)            C29H50O2
3-Phytylmenadione (Vitamin K1)    C31H46O2
Prostaglandin E2                  C20H32O5

Competition for Bodipy-FA uptake by FATP4 expressing cells by
different hydrophobic compounds. The uptake of 5 µM Bodipy-FA,
C1-Bodipy-C12, was measured in the presence of a 20-fold molar
excess (i.e., 100 µM) of the indicated fatty acids or fatty acid
derivatives. The maximal 100% inhibition was defined as the
amount of Bodipy-FA incorporated in the presence of 200 µM lauric
acid, which was on average 18% ± 5% that of untreated cells.

-: 0% - 30% inhibition by the indicated substance
±: 30% - 50% inhibition
+: 50% - 70% inhibition
++: 70% - 100% inhibition
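The scoring scheme in the legend can be expressed as a small calculation. The Python sketch below is one possible reading, not the procedure actually used: it assumes that percent inhibition is scaled between the uptake of untreated cells (0% inhibition) and the uptake remaining with 200 µM lauric acid (100% inhibition), and that a value falling exactly on a boundary (30%, 50%, 70%) is assigned to the higher category, since the legend leaves this open. All function names and the example readings are hypothetical.

```python
def percent_inhibition(uptake: float, untreated: float, floor: float) -> float:
    """Percent inhibition scaled so that `untreated` uptake is 0% and the
    uptake remaining with 200 µM lauric acid (`floor`) is 100%."""
    if untreated <= floor:
        raise ValueError("untreated uptake must exceed the lauric acid floor")
    return 100.0 * (untreated - uptake) / (untreated - floor)


def competition_symbol(inhibition: float) -> str:
    """Map percent inhibition to the Table 7 symbols.
    Boundary values are assigned to the higher category (an assumption)."""
    if not 0.0 <= inhibition <= 100.0:
        raise ValueError("inhibition must lie between 0 and 100 percent")
    if inhibition < 30.0:
        return "-"
    if inhibition < 50.0:
        return "±"
    if inhibition < 70.0:
        return "+"
    return "++"


# Hypothetical readings, expressed as percent of untreated uptake.
# The floor of 18 corresponds to the 18% average reported in the legend.
example = percent_inhibition(uptake=59.0, untreated=100.0, floor=18.0)
example_symbol = competition_symbol(example)
```

In this made-up example, residual uptake of 59% of the untreated value corresponds to 50% inhibition on the scaled axis and is scored "+".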
Example 17: Identification and Characterization of the FATP5 Promoter
METHODS

BAC Isolation and Luciferase Constructs

An arrayed BAC library was screened by PCR for FATP5 genomic clones.
PCR primers designed by a program from the Whitehead Institute's Genome Center
specifically amplified a single band of the correct size from mouse genomic DNA.
Two putative BACs containing the FATP5 genomic sequence were identified and the
presence of FATP5 sequence was confirmed by dot hybridization of the BAC with
the mmFATP5 cDNA.

After isolation of positive BACs, large amounts of bacteria were grown and
DNA prepared using a Qiagen maxi-prep kit (Qiagen, Venlo, The Netherlands). The
BAC was digested with Sac I and ligated into pZero-2 (Invitrogen, Carlsbad, CA).
Inserts containing mmFATP5 genomic sequence were identified by screening colony
lifts of the ligation with an α-32P-ATP radiolabeled, random primed (Boehringer-
Mannheim, Indianapolis, IN) mmFATP5 cDNA as a probe. Positive colonies were
picked and restriction analysis with Sac I revealed them to contain an identical, large
insert of 8-10 kb. Digestion of the Sac I fragment with BstX I yielded three pieces
that were subsequently subcloned into pZero and sequenced using an ABI sequencer
(Research Genetics). A 1.3 kb piece containing sequence immediately upstream of
the FATP5 initiator methionine was subcloned into the Xho I and Bgl II sites of the
promoter-less pGL3 luciferase reporter vector (Promega Corp., Madison, WI). An
additional 7 kb of upstream sequence was subcloned into the Xho I and Sac I sites of
the prior construct to yield a final construct containing approximately 8 kb of genomic
sequence upstream of the initiator methionine. Deletions of the FATP5 promoter
were constructed using PCR with the 1.3 kb promoter construct as the template.
Products were amplified with primers containing Hind III (5' primer) and Xho I (3'
primer) sites using Elongase (Gibco, Rockville, MD). The resulting fragments were
cut with Hind III and Xho I and subcloned into the corresponding sites of the
promoter-less pGL3 luciferase reporter vector. The internal 30 base pair deletions,
GC box mutations, and 10 nucleotide linker scan were all created with the
Quickchange mutagenesis kit (Stratagene, La Jolla, CA) according to the
manufacturer's instructions. At least two different bacterial colonies were picked for
each construct. The inserts from both colonies were sequenced to check for
unintended point mutations and both constructs were assayed for luciferase activity.

Cell culture, Transfection, and Luciferase Measurements

HepG2, Hep3B, HT1080, 3T3-L1, BOSC, and HACAT cells were grown in
DMEM supplemented with 10% fetal calf serum, 1x penicillin-streptomycin and
glutamine (Gibco, Rockville, MD). Mink lung cells were grown in MEM
supplemented with 10% fetal calf serum, 1x minimal essential amino acids, 1x
penicillin-streptomycin and glutamine. The evening prior to transfection, cells were
plated at 50-60% confluence in 24 well dishes. The following morning, cells were
placed in 2 ml of fresh media and 250 µL of a CaPO4 solution (Invitrogen, Carlsbad,
CA) containing 2 µg of a luciferase reporter construct and 0.5 µg of pCMV-β-gal
was added to the cells. pCMV-β-gal constitutively expresses β-galactosidase and
was used to normalize transfection efficiency (Hua et al., 1998). After 12 hours, the
cells were washed twice with DMEM and placed in fresh media. Thirty six hours
later, the media over the cells was removed and 250 µL of 1x reporter lysis buffer
(Promega Corp., Madison, WI) was added. After vigorous shaking for 15 minutes at
room temperature, the supernatants were transferred to Eppendorf tubes and briefly
centrifuged to remove particulates. 20 µL from these tubes was used for
determination of luciferase activity (Promega Corp., Madison, WI) and 20 µL was
used for the measurement of β-galactosidase activity (Clontech, Palo Alto, CA). All
luciferase values were normalized to β-galactosidase to control for transfection
efficiency and expressed as relative luciferase units (RLU). For experiments
comparing different cell lines, promoter activity was computed as a fold induction by
dividing the RLU activity of either the -8 kb or -271 promoter construct by the RLU
activity of a promoter-less construct. Each data point was done in triplicate and each
experiment was repeated a minimum of three times.
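The normalization and fold-induction calculation described above can be sketched as follows. This is a minimal Python illustration with made-up readings; the function names are hypothetical, and averaging the triplicate wells before taking the ratio is an assumption, since the text does not say how replicates were combined.

```python
from statistics import mean


def relative_luciferase_units(luciferase: float, beta_gal: float) -> float:
    """Normalize a luciferase reading to the beta-galactosidase reading of the same well."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return luciferase / beta_gal


def fold_induction(promoter_wells, promoterless_wells) -> float:
    """Fold induction of a promoter construct over the promoter-less vector.
    Each argument is a list of (luciferase, beta_gal) readings, one per replicate well."""
    promoter_rlu = mean(relative_luciferase_units(l, b) for l, b in promoter_wells)
    baseline_rlu = mean(relative_luciferase_units(l, b) for l, b in promoterless_wells)
    return promoter_rlu / baseline_rlu


# Hypothetical triplicate readings (arbitrary units), not data from this work.
promoter_construct = [(7000.0, 2.0), (7200.0, 2.0), (6800.0, 2.0)]
promoterless_vector = [(100.0, 1.0), (90.0, 1.0), (110.0, 1.0)]

induction = fold_induction(promoter_construct, promoterless_vector)
```

Dividing by the β-galactosidase reading first means that a well with twice the transfection efficiency does not appear to have twice the promoter activity.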



WO 01/21795                                                  PCT/US00/25891

Northern Blots, Preparation of Nuclear Extracts, and Gel Shift Assays

Human poly-A northern blots were purchased from a commercial vendor
(Clontech, Palo Alto, CA) and probed with a piece of the human FATP5 3'
untranslated region specific for FATP5. Nuclear lysates from HepG2 and BOSC
cells were prepared essentially according to the method of Hua et al. and stored at
-80°C (Hua et al., 1998). Probes for gel shift assays were end labeled using T4
polynucleotide kinase (Boehringer-Mannheim, Indianapolis, IN) and gel purified.
Gel shifts were performed at room temperature in 30 µL reactions comprised of 6 µL
5X binding buffer (100 mM Tris 8.0, 300 mM KCl, 5 mM EDTA, 8 mM MgCl2, and
36% glycerol), 0.5 µL of 100 mM DTT, 1 µL of 10 mg/ml BSA, 2 µL of 2 mg/ml
poly dI/dC, and 5 µL nuclear lysate. Ten minutes after the addition of nuclear lysate,
40,000 cpm of 32P-labeled probe were added. After 20 minutes at room temperature,
loading dye was added and the reaction run on a 4% non-denaturing gel.
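The final concentrations in the 30 µL gel shift reaction follow from the listed volumes and stock concentrations. The Python sketch below works through the dilution arithmetic; it assumes that the poly dI/dC addition is 2 µL (the unit is not legible in the source) and that the remaining volume is made up with water and probe, which the text does not specify. Names are illustrative.

```python
REACTION_VOLUME_UL = 30.0


def final_concentration(stock_conc: float, added_volume_ul: float,
                        total_volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Concentration after diluting added_volume_ul of stock into total_volume_ul."""
    return stock_conc * added_volume_ul / total_volume_ul


# Composition of the 5X binding buffer as listed above.
binding_buffer_5x = {
    "Tris pH 8.0 (mM)": 100.0,
    "KCl (mM)": 300.0,
    "EDTA (mM)": 5.0,
    "MgCl2 (mM)": 8.0,
    "glycerol (%)": 36.0,
}

# 6 µL of 5X buffer in a 30 µL reaction gives a 1X final buffer.
final_buffer = {name: final_concentration(conc, 6.0)
                for name, conc in binding_buffer_5x.items()}

dtt_mM = final_concentration(100.0, 0.5)          # 0.5 µL of 100 mM DTT
bsa_mg_per_ml = final_concentration(10.0, 1.0)    # 1 µL of 10 mg/ml BSA
poly_didc_mg_per_ml = final_concentration(2.0, 2.0)  # assumed 2 µL of 2 mg/ml

# Volume accounted for by the listed components; the rest is unspecified.
listed_volume_ul = 6.0 + 0.5 + 1.0 + 2.0 + 5.0
remaining_volume_ul = REACTION_VOLUME_UL - listed_volume_ul
```

Under these assumptions the reaction contains 20 mM Tris, 60 mM KCl, 1 mM EDTA, 1.6 mM MgCl2 and 7.2% glycerol, with roughly 1.7 mM DTT and 0.33 mg/ml BSA.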

RESULTS 

Human FATP5 mRNA is only expressed in adult liver

We had previously reported that mmFATP5 mRNA was only expressed in the
liver (Hirsch et al., 1998). To determine if the human isoform of FATP5 was also
liver specific, we performed northern analysis using a probe from the 3' transcribed
but untranslated region of the human gene. Similar to the mouse homolog, hsFATP5
is liver specific. Interestingly, hsFATP5 was not expressed in fetal liver, suggesting
that it may be developmentally regulated.



Identification of a FATP5 promoter

We next set out to determine the cis-acting elements responsible for liver
specific expression of FATP5. We identified BACs containing the FATP5 genomic
locus and subcloned a 10 kb Sac I fragment which was subsequently sequenced. The
Sac I fragment contains approximately 8 kb of genomic sequence upstream of the
FATP5 initiator methionine. Blast searches using the 5' end of the Sac I sequence
revealed that it contained coding sequence for an unknown gene immediately
upstream of FATP5. Since the FATP5 promoter is unlikely to overlap the coding
sequence of another gene, we hypothesized that the 10 kb Sac I fragment contained
the FATP5 promoter. To test this hypothesis, 8 kb of genomic DNA upstream of the
translational initiator of FATP5 was subcloned into the promoter-less pGL3 luciferase
reporter vector. This construct was transiently transfected into the HepG2 liver cell
line and luciferase activity was determined. The -8 kb piece of DNA resulted in a
35-fold induction of luciferase activity when compared to a pGL3 vector without the
FATP5 genomic sequence (FIG. 100). To determine if this activity reflected tissue
specific transcription, the -8 kb luciferase reporter construct was transfected into a
variety of additional cell types. While promoter activity was also detected in the
Hep3B hepatoma cell line, non-liver cell lines did not express luciferase above the
level of the promoter-less vector. Thus, the 8 kb upstream genomic element
recapitulated liver specific expression in vitro.

The FATP5 promoter resides within the 261 base pairs upstream of the initiator
methionine and requires a single GC box

To determine the cis-acting elements in the -8 kb of genomic sequence
responsible for transcriptional activity, serial 5' deletions of the promoter were
constructed and transfected into HepG2 cells. Surprisingly, greater than 90% of the
-8 kb was dispensable for promoter activity. A construct containing only 261 base
pairs upstream of the initiator methionine resulted in promoter activity equivalent to
that of the -8 kb construct (FIG. 101). Identical results were obtained when the
deletion series was transfected into Hep3B cells (data not shown). We next
determined if promoter activity of a small genetic element was tissue specific.
Transfection of a construct containing 271 base pairs upstream of the initiator
methionine into a variety of cell lines essentially replicated the results of the -8 kb
construct in that expression was observed only in liver derived cell lines (FIG. 102).

Since deletion analysis revealed that bases between -261 and -218 were
required for promoter activity, we closely examined this region for binding sites of
known transcription factors and found the sequence GGGGCGGGG between
nucleotides -241 and -232 (FIG. 103A). This sequence binds the Sp1 family of
transcription factors and is termed a GC box. To determine if the activity of the -271
construct required the GC box, we mutated the GC box. The first construct deleted
nucleotides -241 to -222, which removed the GC box and additional downstream
sequence which, although less optimal, might also bind the Sp1 family of
transcription factors (SEQ ID NO.: 107). The second construct had three G to A point
mutations in the GC box between nucleotides -241 and -232 (SEQ ID NO.: 108). Such
mutations had previously been shown to abolish transcriptional activity of GC boxes
(Rodenburg et al., 1997). In contrast to the wild type -271 promoter, both of the
mutated constructs were transcriptionally inactive in HepG2 cells (FIG. 103B).
Identical results were also obtained in Hep3B cells (data not shown). This suggests
that the GC box between -241 and -232 is essential for transcriptional activity of the
FATP5 promoter. We next examined whether the sequences necessary for luciferase
activity also bound proteins in nuclear extracts from HepG2 cells. Two different
oligonucleotides were used for gel shift analysis. One oligonucleotide (AF-1)
contained nucleotides -250 to -230 (SEQ ID NO.: 111) and the other (AF-2) spanned
nucleotides -260 to -200 (SEQ ID NO.: 109) (FIG. 104). Both oligonucleotides
yielded three significant complexes from HepG2 nuclear extracts. All complexes
were specific, as a 100-fold excess of the same unlabeled oligonucleotide could compete
for binding of the radiolabeled oligonucleotide. Mutant AF-1 oligonucleotides
containing three point mutations in the GC box did not bind any proteins in HepG2
nuclear extracts or compete for binding of nuclear proteins to the AF-1 or AF-2
oligonucleotides (data not shown). Oligonucleotides AF-1 and AF-2 also bound
recombinant Sp1 (Promega Corp., Madison, WI; data not shown). However, nuclear
extract from BOSC cells, a kidney cell line, and HepG2 cells had identical patterns of
complex formation (data not shown).
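Locating a GC box such as GGGGCGGGG in an upstream sequence, with coordinates counted back from the translational start as in the constructs above, can be sketched in a few lines of Python. The sequence used here is a toy placeholder built only to illustrate the coordinate convention; it is not the FATP5 promoter sequence, and the function names are hypothetical.

```python
GC_BOX = "GGGGCGGGG"
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_motif(upstream: str, motif: str = GC_BOX):
    """Find a motif on either strand of an upstream sequence.

    `upstream` is written 5' to 3' and ends immediately before the start codon,
    so its last base is position -1. Returns sorted (start, end, strand) tuples
    in these negative coordinates.
    """
    n = len(upstream)
    hits = set()
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        index = upstream.find(pattern)
        while index != -1:
            hits.add((index - n, index + len(pattern) - 1 - n, strand))
            index = upstream.find(pattern, index + 1)
    return sorted(hits)


# Toy 271-nucleotide upstream sequence (hypothetical) with a GC box placed
# so that its first base lies at position -241.
toy_upstream = "AT" * 15 + GC_BOX + "AT" * 116

gc_box_hits = find_motif(toy_upstream)
```

Searching both strands matters because a GC box can be recognized in either orientation; here the reverse complement, CCCCGCCCC, is scanned alongside the forward motif.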

Identification of novel sequences required for transcriptional activity of the FATP5
promoter

While the GC box between nucleotides -241 and -232 is essential for
transcriptional activity, additional sequences downstream of the GC box might also
be required for transcription. To determine if such sequences existed, we created 30
base pair internal deletions in the -271 construct downstream of the GC box.
Constructs that had deletions in sequences between 240 and 180 nucleotides upstream
of the FATP5 translational initiator had greatly reduced transcriptional activity in
HepG2 cells (FIG. 105). To identify the specific sequences within this region
required for FATP5 transcription, a 10 nucleotide linker (CTAACAGGAG) (SEQ ID
NO.: 113) was exchanged for wild type sequence within the context of the -271 base
pair construct (FIG. 106). Inadvertently, the 210 to 200 construct had a single
nucleotide insertion and the 190 to 180 construct had a two nucleotide insertion
relative to the wild type sequence. However, several other linker constructs that also
had equivalent insertions (230 to 220 or 170 to 160, for example) had high levels of
luciferase activity. Thus the decrease in luciferase activity in the 190 to 180 and 210
to 200 constructs is due to changes in the nucleotide sequence and not the result of
the nucleotide additions. Transfection of these DNAs into HepG2 cells revealed two
regions important for transcription. Mutating sequences between nucleotides -210
and -200 or between nucleotides -190 and -180 drastically reduced luciferase
activity (FIG. 106).

In both humans and mice, FATP5 is only expressed in the liver. To determine
the promoter elements mediating liver specific transcription, we isolated a BAC
encoding the mouse FATP5 genomic locus and sequenced 10 kb upstream of the
transcriptional start. Since this 10 kb of genomic DNA did not contain either a
TATA box or GC rich regions found in TATA-less promoters, FATP5 may utilize
non-canonical sequences for transcription initiation. Unfortunately, attempts to
identify the transcriptional start using primer extension were unsuccessful, perhaps
due to secondary structure in the 5' UTR. Since we did not unambiguously determine
the transcriptional start site, the nucleotide numbering in all of the promoter
constructs refers to the distance from the translational start codon.



GC box and Sp1 transcription factors

Since another gene was situated approximately 8 kb upstream of the FATP5
initiator methionine, we hypothesized that promoter elements were likely within this
region of DNA. A luciferase reporter construct containing this sequence was
transcriptionally active in two liver cell lines but was inactive in cell lines derived
from lung, muscle, kidney, skin, or fibroblasts. Deletion analysis of the -8 kb reporter
construct revealed that the FATP5 promoter was contained within the 261 nucleotides
upstream of the initiator methionine. Promoter activity in this -261 base pair piece
required the presence of a single GC box. Gel shift assays with oligonucleotides
containing this GC box revealed the presence of three distinct complexes that
required a functional GC box for binding. GC boxes bind the Sp1 family of
transcription factors and the multiple complexes could reflect the binding of different
members of the Sp1 protein family or different post-translational modifications of
Sp1 in HepG2 cells (Rodenburg et al., 1997). Although the Sp1 family of
transcription factors is widely expressed, Sp1 has been shown to be important for the
transcription of several liver specific genes and is upregulated in liver after birth
(Rodenburg et al., 1997). In some cases, Sp1 will facilitate the binding of a tissue
specific transcription factor to DNA. For example, Sp1 binding to DNA enhances the
binding of C/EBPβ to an adjacent site in the liver specific CYP2D5 promoter (Lee et
al., 1994). Since the C/EBPβ binding site in the CYP2D5 promoter is suboptimal,
C/EBPβ binding to this site requires the presence of Sp1 or nuclear extract. A
similar situation could occur in the FATP5 promoter. Although mutations in the 10
nucleotides downstream of the GC box had no effect on luciferase activity, we did
not test mutations immediately upstream of the GC box for effects on promoter
activity. It is also possible that Sp1 might bind an unknown liver specific
transcription factor and recruit it to the FATP5 promoter. Although there is no
experimental evidence for this, Sp1 has recently been shown to bind to a
transcriptional activator, so additional interacting proteins are possible (Ryu et al.,
1999).

Other liver specific transcription factors 

Alternatively, since the Sp1 gene family is important for the transcription of
many genes which are not liver specific, liver specific promoter elements in the
FATP5 promoter might be located elsewhere (Boisclair et al., 1993; Rongnoparut et
al., 1991; Sorensen and Wintersberger, 1999). Analysis of the sequence downstream
of the GC box using TFSearch
(http://pdap1.trc.rwcp.or.jp/research/db/TFSEARCH.html) did not reveal any
additional transcription factor binding sites of relevance (Heinemeyer et al., 1999;
Heinemeyer et al., 1998). Further, we were unable to visually identify binding sites
for known liver specific transcription factors in this sequence (De Simone and
Cortese, 1992; Hanson and Reshef, 1997; Lai, 1992). Thus, we looked
experimentally for additional promoter elements by mutating the sequence
downstream of the GC box and identified two additional sites downstream of the GC
box that were essential for FATP5 transcription. The sequences of these sites do not
conform to any known transcription factor binding sites, suggesting that either novel
proteins bind these elements or that these elements bind known proteins in a novel
manner. Preliminary gel shift data using oligonucleotides spanning these sites
suggest that these two elements may comprise a binding site for a single complex.
Further, additional data suggest that the complex which binds to these two sites
interacts with the GC box 30 base pairs upstream. Interestingly, we noted a
palindromic sequence equally split between these two sites (FIG. 107). Since many
transcription factors bind palindromic DNA elements, it is intriguing to speculate that
these two sequences contribute to the binding site for a novel transcription factor.
Current investigations are focused on identifying the proteins binding to these novel
elements and how these elements interact with the GC box.

Several studies have shown that the FATP gene family is regulated by a
variety of substances including LPS, cytokines, insulin, and diet (Frohnert et al.,
1999; Hui et al., 1998; Memon et al., 1999). Especially intriguing has been a recent
report that FATP1 is upregulated by PPARα ligands in liver cell lines (Martin et al.,
1997; Motojima et al., 1998). Since fatty acids may be endogenous activators of
PPARs, transcriptional regulation of FATP1 by PPARs may represent a physiologic
feedback loop (Gottlicher et al., 1992; Grimaldi et al., 1999; Schoonjans et al., 1996).
Given that liver also expresses FATP5, it will be interesting to see whether this gene
is also regulated by PPARα, and the tools developed here should help address this
question.

Several factors make the FATP5 promoter amenable to further study. First,
liver specific transcription of FATP5 can be recapitulated using immortalized cell
lines in vitro. Second, the minimal required promoter element that confers liver
specific transcription is very small. Third, transcriptional activity of this promoter is
very robust. Thus, further study of the FATP5 promoter may provide additional
insight into the mechanisms of liver specific transcription and regulation of the FATP
gene family.

Example 18: 
Materials and Methods 

Polyclonal antibodies were raised against proteins containing the N-terminal
domain of mouse FATP2 or the C-terminal domain of mouse FATP5 fused to
glutathione-S-transferase (GST). Tissues for immunofluorescence were collected from
8 week old mice and a 2 year old chimpanzee. Tissues were fresh frozen, cut on a
cryostat and mounted on slides. Immunofluorescence was performed as previously
described (Stahl et al., 1999). Pictures were taken on a Zeiss confocal microscope.

To determine FATP2 expression in the gall bladder, mouse gall bladder was
incubated with anti-FATP2 antibody as the primary antibody and rhodamine-labeled
anti-rabbit IgG as the secondary antibody. FATP2 antibody clearly stained the gall
bladder epithelium, but did not result in significant staining of other cell types
(Figure 108).

To further study FATP2 expression, chimpanzee liver was costained with
anti-FATP2 antibody (green) and anti-CD31 antibody (red). CD31 is expressed on
endothelial cells and is used as a marker for blood vessels. FATP2 immunoreactivity
was present in large patches which overlap with CD31 positive areas, suggesting that
FATP2 protein was present in the space of Disse, the area where hepatocytes exchange
nutrients with the blood. This implicates FATP2 in the uptake of fatty acids into
hepatocytes. In addition to areas which overlap with CD31 immunoreactivity,
FATP2 protein was also present on the cell surface of hepatocytes in a small bead
pattern. Immunoelectron microscopy of similar sections showed that FATP2
immunoreactivity was localized in the walls of bile canaliculi, which are formed by the
liver cells (Figure 109). The presence of FATP2 in bile canaliculi in the liver as well as
its presence in the gall bladder epithelium suggests a role for FATP2 in either
absorption or secretion of fatty acids into the bile. The levels of free fatty acids in the
bile have been associated with the frequency of gallstone formation.

To further study FATP5 expression, chimpanzee liver was costained with
anti-FATP5 antibody (green) and anti-CD31 antibody (red). CD31 is expressed on
endothelial cells and is used as a marker for blood vessels. FATP5 immunoreactivity
was present in large patches which overlap with CD31 positive areas, suggesting that
FATP5 protein was present in the space of Disse, the area where hepatocytes exchange
nutrients with the blood (Figure 110). This implicates FATP5 in the uptake of fatty
acids into hepatocytes.

Example 19 Identification and Characterization of Human FATP3 Proteins

Isolation of additional human FATP3 clones

An additional clone encoding human FATP3 was identified by searching for
sequences similar to murine or human FATP3 coding regions using the BlastX
algorithm in a proprietary database (Altschul, et al., J. Mol. Biol. 215: 403-410, 1990).
One clone, identified by random library sequencing and designated
johni003f04 (SEQ ID NO: 116), extends the open reading frame of the hsFATP3
polypeptide sequence by 30 amino acids at the N-terminus when compared to
previously discovered sequences. The DNA sequence of this clone is shown in
Figures 111A and 111B, and the predicted protein sequence (SEQ ID NO: 117) is
shown in Figure 112. The open reading frame of this clone begins at the initial
nucleotide and includes nucleotide 2240. The first ATG is located at nucleotide
number 51, resulting in a predicted protein which includes 730 amino acids. An
FATP signature sequence (see Hirsch et al., PNAS, 95: 8625-8629, 1998) is clearly
present between amino acids 331 and 640 of hsFATP3. Within this signature
sequence hsFATP3 is 48% identical to hsFATP1 at the amino acid level. A
consensus AMP-binding motif has been identified (amino acids 333-334). Thus,
hsFATP3 is clearly a member of the fatty acid transport protein family.
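
The relationship between the stated coordinates (first ATG at nucleotide 51, open reading frame including nucleotide 2240) and the stated protein length (730 amino acids) can be checked with simple arithmetic. The sketch below is only an illustration: it assumes 1-based, inclusive coordinates and assumes that nucleotide 2240 is the last nucleotide of the final amino acid codon, neither of which is stated explicitly in the text. The function name is hypothetical.

```python
def codon_count(first_nt: int, last_nt: int) -> int:
    """Count whole codons in the 1-based, inclusive span first_nt..last_nt."""
    span = last_nt - first_nt + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3


# Coordinates quoted in the text for SEQ ID NO: 116.
FIRST_ATG = 51      # position of the first ATG
LAST_ORF_NT = 2240  # assumed here to end the last amino acid codon

predicted_length = codon_count(FIRST_ATG, LAST_ORF_NT)
print(predicted_length)
```

Under these assumptions the span from nucleotide 51 to nucleotide 2240 is 2190 nucleotides, which corresponds to the 730 amino acids reported.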

Functional analysis of FATP3 Clones

SEQ ID NO: 116 is contained in the mammalian expression vector pMET7
(Tartaglia, et al., Cell, 83: 1263-1271, 1995). To determine if the protein encoded by
this DNA sequence can mediate fatty acid uptake, SEQ ID NO: 116 was transfected
into COS cells. Uptake of a BODIPY-labeled fatty acid was determined as described
in previous experiments (Hirsch, et al., PNAS, 95: 8625-8629, 1998). Transfection
with SEQ ID NO: 116 resulted in a dramatic increase in fatty acid uptake when
compared to transfection with vector control. In this experiment, CD31 served as a
marker for transfected cells. Only CD31 positive cells were considered for analysis
(see Hirsch, et al., PNAS, 95: 8625-8629, 1998 for details). The results (Figure 113)
demonstrate that SEQ ID NO: 116 encodes a functional fatty acid transport protein.

Tissue Distribution of human FATP3 

Polyclonal antibodies were raised by immunizing rabbits with GST fused to 
the most C-terminal 89 amino acids of mmFATP3 - 
(RPPQALNLVQLYSHVSENLP^ 

DPSVLSDPLYVLDQDIGAYLPLTPARYSALLSGDLRI) (SEQ ID NO: 120). 
Western blotting experiments with murine tissue lysates using the anti-FATP3 
antiserum closely confirmed the unique expression pattern of FATP3 as judged by 
northern blot experiments. This, together with the fact that the serum reacted only 
weakly with lysates from cell lines expressing either FATP1, -2, -4 or -5, indicates 
that the antibody recognizes preferentially FATP3, but not other FATP family 
members. 

FATP3 protein was detected in mouse liver, spleen, heart, kidney, testis, white
adipose tissue, and most notably in the lung. FATP3 expression in the lung
was examined further by immunofluorescence microscopy. 5 to 10 μm thick fresh frozen
unfixed sections of murine and chimpanzee lungs were blocked with 10% FCS/1%
donkey serum/1% BSA in HBS and incubated overnight with anti-FATP3 serum in
blocking solution. After washing the sections, Alexa 488 conjugated donkey
anti-rabbit secondary antibodies were used to detect bound anti-FATP3 primary
antibodies and nuclei were stained with TOTO-3. In later experiments, chimpanzee lung
was incubated with a mixture of rabbit anti-FATP3 and mouse monoclonal
anti-CD31 to visualize FATP3 as well as blood vessels. Sections were imaged on a Zeiss
LSM510 confocal microscope. Experiments carried out once with mouse and three
times with chimpanzee lung tissue showed that FATP3 is present at high levels in
type II pneumocytes, a cell type responsible for secretion of surfactant, a
phospholipid-rich film critical for lung function. The exact function of FATP3 in type
II pneumocytes is not yet clear. One hypothesis is that FATP3 is responsible for
supplying fatty acid substrates for the synthesis of surfactant.

PCR-based experiments showed that the exocrine as well as endocrine
pancreas expresses FATP3. This was confirmed by immunofluorescence,
performed as described above for the lung sections, on chimpanzee pancreas, which
showed FATP3 localized to the plasma membrane of acinar cells and a punctate
expression pattern on the plasma membrane and in the cytosol of alpha and beta cells
of the pancreatic islets. The identification of a fatty acid transporter in the
insulin-producing cells of the pancreas has potentially broad implications for the treatment of
type II diabetes and obesity. In both diseases, fatty acid levels in the blood are
elevated and, in later stages of the disease, lead to diminished insulin secretion by the
pancreas due to the induction of apoptosis in insulin-producing beta cells
(Shimabukuro, et al., PNAS, 95: 2498-2502, 1998). Blocking fatty acid uptake into
the beta cells could possibly prevent apoptosis and maintain insulin secretion, thus
preventing the progression from obesity to diabetes.



Example 20 Identification of a fatty acid binding domain in FATP4 

GST fusion proteins were constructed in pGEX for four regions of hsFATP4
(SEQ ID NO: 52; Figure 51) which were generated by PCR and verified by
sequencing. The first three fusion proteins were constructed from regions near the
N-terminal portion of the protein. SP1 (SEQ ID NO: 121) contained amino acid residues
43-239 of the hsFATP4 sequence as shown in Figure 114A. This portion of hsFATP4
contains a lipocalin domain (as shown in Figure 117) as well as a number of residues
which in hsFATP4 are upstream of the lipocalin domain. SP2 (SEQ ID NO: 122)
contained residues 43-290 of the hsFATP4 sequence as shown in Figure 114B. This
portion of hsFATP4 contains a lipocalin domain and an AMP binding domain as
well as a number of residues which are upstream of the lipocalin domain. SP3 (SEQ
ID NO: 123) contained amino acid residues 125-290 of the hsFATP4 sequence as
shown in Figure 114C. This portion of hsFATP4 contains a lipocalin domain and
an AMP binding domain, but does not contain the upstream residues. The fourth
fusion protein was constructed from a region at the C-terminal end of the hsFATP4
polypeptide. SP5 contained amino acid residues 417-643 of the hsFATP4 polypeptide as
shown in Figure 114D (SEQ ID NO: 124).

Proteins were expressed in E. coli and purified on glutathione affinity beads
using standard techniques. To determine fatty acid binding, beads were mixed with
100 μM 14C-labeled fatty acids in mixed micelles with taurocholate (10 mM, Sigma)
and incubated for 30 minutes at room temperature. The beads were subsequently
washed with PBS containing 10 mM taurocholate and radioactivity associated with
beads was assessed by scintillation counting. A fusion to the C-terminal domain of
hsFATP4 (SP5) did not show any oleate (ARC) binding compared to GST protein
alone, while two N-terminal fusions (SP1 and SP2) bound significant amounts of oleate
(Figure 116).



FATTY ACID    SP1           SP2           SP3         SP5         GST

Oleate        25772±1326    16172±1639    4206±631    2413±186    1511±525
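
One way to read the oleate values in the table above is to express each fusion protein's signal relative to the GST-only control. The sketch below does this with the tabulated means. It is an illustration only: the table does not state the units or how the ± term was calculated, so the second number in each pair is carried along without interpretation, and the function and variable names are hypothetical.

```python
# Oleate binding values as tabulated: (mean, reported ± term).
# Units are not stated in the table and are not assumed here.
oleate_bound = {
    "SP1": (25772, 1326),
    "SP2": (16172, 1639),
    "SP3": (4206, 631),
    "SP5": (2413, 186),
    "GST": (1511, 525),
}


def fold_over_control(measurements, control="GST"):
    """Return each mean divided by the mean of the control sample."""
    control_mean = measurements[control][0]
    return {
        name: mean / control_mean
        for name, (mean, _spread) in measurements.items()
        if name != control
    }


fold = fold_over_control(oleate_bound)
for name, ratio in fold.items():
    print(f"{name}: {ratio:.1f}-fold over GST alone")
```

Expressed this way, SP1 and SP2 are roughly 17-fold and 11-fold over GST alone, while SP3 and SP5 are below 3-fold, in line with the description of SP1 and SP2 as the fusions that bound significant amounts of oleate.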



Similar results were obtained using maltose-binding protein fusions. MBP
fusion constructs were generated by digesting the pGEX-SP constructs with
EcoRI/XhoI and ligating the inserts into pMAL digested with EcoRI/SalI. MBP fusion proteins
were expressed in E. coli and were purified under non-denaturing conditions following
the manufacturer's instructions. To determine fatty acid binding, beads were mixed
with 100 μM 14C-labeled fatty acids in mixed micelles with taurocholate (10 mM,
Sigma) and incubated for 30 minutes at room temperature. The beads were
subsequently washed with HBS containing 10 mM taurocholate. The proteins were
subsequently eluted from the resin with maltose and the amount of fatty acid binding
to MBP-SP1, -2, -3, and -5 was assessed by determining the radioactivity associated
with the eluate by β-scintillation counting.

Unlike GST fusion proteins, MBP fusion proteins are not self-dimerizing.
Further, long-chain fatty acids (such as oleate and palmitate), but not short-chain fatty
acids (such as butyrate), were specifically bound by SP1 (Figure 117). This selective
binding is consistent with previous reports of the substrate specificity of FATP4
(Stahl, et al., Mol. Cell, 4, 299-308, 1999). The identification of a fatty acid binding
domain in FATP4 will be useful in the development of small molecules that inhibit the
binding and transport of fatty acids by FATP4 and may provide useful information on
the mechanism of fatty acid transport.



Results of Fatty Acid Binding

FATTY ACID      Composition    binding to MBP-SP1    binding to MBP-SP5

Oleate          C18H34O2       3968                  2800

Palmitate       C16H32O2       4588                  844

Arachidonate    C20H40O2       1942                  1147

Butyrate        C4H8O2         142                   633



These experiments demonstrate that the FATPs of the present invention
contain domains that bind various long chain fatty acids. Thus, polypeptides
containing these domains can be prepared and utilized to assess the modulation of
binding and transport function by a variety of agents. The polypeptides with the
highest binding capacities were shown to be those containing a lipocalin domain (such
as those shown in Figure 118) with additional upstream residues, such as those
associated with this domain in the N-terminal portion of hsFATP4. Polypeptides
containing domains in addition to the lipocalin domain (for example, those containing
an AMP binding domain) were also shown to bind fatty acids at significant levels.

Figure 118 contains an alignment depicting the consensus sequences for the six
human FATP polypeptides, hsFATP1, hsFATP2, hsFATP3, hsFATP4, hsFATP5 and hsFATP6.
A lipocalin domain and an AMP binding domain for each polypeptide
are both identified and compared. A search using the lipocalin signature sequence
[DENG]-X-[DENQGSTARK]-X(0,2)-[DENQARK]-[LIVFY]-{CP}-G-{C}-W-
[FYWLRH]-X-[LIVMTA] conducted on a public database (www.ebi.ac.uk/interpro/)
indicated that the lipocalin domains of hsFATP1 and hsFATP4 are identical to the
lipocalin signature sequence. In addition, a search directed to identifying sequences
having at least 80% identity to the lipocalin signature sequence identified three
additional human FATPs, hsFATP3, hsFATP5 and hsFATP6.



The following is the result of comparing individual hsFATP protein sequences
with the lipocalin domain identified for hsFATP1 and hsFATP4. The comparison was
made using the BLAST Network Service at the National Center for Biotechnology
Information. (Capitalized AA agree with the lipocalin signature sequence.)



FATP6: 114 to 125 NEpDFVhVWFGL. 76% similarity (SEQ ID NO: 138)
AATGAGCCGGACTTCGTTCACGTGTGGTTCGGCCTC

FATP5: 182 to 194 sQAVpaLcMWLGL. 53% similarity (SEQ ID NO: 139)
TCCCAGGCCGTTCCAGCCCTGTGTATGTGGCTGGGGCTG

FATP4: 134 to 146 ENRNEFVGLWLGM. Identity (SEQ ID NO: 129)
GAGAACCGCAATGAGTTCGTGGGCCTATGGCTGGGCATG

FATP3: 221 to 234 IPAGPEFLwLWFGL. 69% similarity (SEQ ID NO: 140)
CTCCCCGCTGGCCCAGAGTTTCTGTGGCTCTGGTTCGGGCTG

FATP2: 112 to 124 GNEPAYVwLWLGL. 80% similarity (SEQ ID NO: 127)
GGTAACGAGCCGGCCTACGTGTGGCTGTGGCTGGGGCTG

FATP1: 136 to 148 EGRPEFVGLWLGL. Identity (SEQ ID NO: 126)
GAGGGCCGGCCGGAGTTCGTGGGGCTGTGGCTGGGCCTG
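
The signature-sequence comparison described above can be illustrated by converting the PROSITE-style pattern into a regular expression and testing the six peptides listed. This is a minimal sketch, not the search performed on the public database: the converter handles only the pattern elements that appear in this signature, the pattern is read with the final elements as [FYWLRH]-X-[LIVMTA], the peptides are converted to upper case before matching, and all function and variable names are hypothetical.

```python
import re

LIPOCALIN_SIGNATURE = (
    "[DENG]-X-[DENQGSTARK]-X(0,2)-[DENQARK]-[LIVFY]-{CP}-G-{C}-W-"
    "[FYWLRH]-X-[LIVMTA]"
)

# Peptides from the list above, upper-cased for matching.
PEPTIDES = {
    "FATP1": "EGRPEFVGLWLGL",
    "FATP2": "GNEPAYVwLWLGL".upper(),
    "FATP3": "IPAGPEFLwLWFGL".upper(),
    "FATP4": "ENRNEFVGLWLGM",
    "FATP5": "sQAVpaLcMWLGL".upper(),
    "FATP6": "NEpDFVhVWFGL".upper(),
}

ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Za-z])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into Python regex syntax."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        m = ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = m.groups()
        if core.startswith("["):        # allowed residues
            rx = core
        elif core.startswith("{"):      # excluded residues
            rx = "[^" + core[1:-1] + "]"
        elif core in ("x", "X"):        # any residue
            rx = "."
        else:                           # one fixed residue
            rx = core
        if low is not None:             # repeat count or range
            rx += "{" + low + ("," + high if high is not None else "") + "}"
        parts.append(rx)
    return "".join(parts)


signature_regex = re.compile(prosite_to_regex(LIPOCALIN_SIGNATURE))
matches = {
    name: signature_regex.search(seq) is not None
    for name, seq in PEPTIDES.items()
}
for name, hit in sorted(matches.items()):
    print(name, "matches" if hit else "does not match")
```

With this reading of the pattern, only the hsFATP1 and hsFATP4 peptides satisfy the full signature, which agrees with the statement that these two are identical to the lipocalin signature sequence while the other four are reported only as percent similarities.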

REFERENCES 

Abumrad, N., Coburn, C., and Ibrahimi, A. (1999). Membrane proteins implicated in
long-chain fatty acid uptake by mammalian cells: CD36, FATP and FABPm. Biochim
Biophys Acta 1441, 4-13.

Berk, P. D., Bradbury, M., Zhou, S. L., Stump, D., and Han, N. I. (1996).
Characterization of membrane transport processes: lessons from the study of BSP,
bilirubin, and fatty acid uptake. Semin Liver Dis 16, 107-20.

Berk, P. D., and Stump, D. D. (1999). Mechanisms of cellular uptake of long chain
free fatty acids. Mol Cell Biochem 17-31.

Boisclair, Y. R., Brown, A. L., Casola, S., and Rechler, M. M. (1993). Three clustered
Sp1 sites are required for efficient transcription of the TATA-less promoter of the
gene for insulin-like growth factor-binding protein-2 from the rat. J Biol Chem 268,
24892-901.






De Simone, V., and Cortese, R. (1992). Transcription factors and liver-specific genes.
Biochim Biophys Acta 1132, 119-26.

Fitscher, B. A., Elsing, C., Riedel, H. D., Gorski, J., and Stremmel, W. (1996).
Protein-mediated facilitated uptake processes for fatty acids, bilirubin, and other
amphipathic compounds [see comments]. Proc Soc Exp Biol Med 212, 15-23.

Frohnert, B. I., Hui, T. Y., and Bernlohr, D. A. (1999). Identification of a functional
peroxisome proliferator-responsive element in the murine fatty acid transport protein
gene. J Biol Chem 274, 3970-7.

Glatz, J. F., Luiken, J. J., van Nieuwenhoven, F. A., and Van der Vusse, G. J. (1997).
Molecular mechanism of cellular uptake and intracellular translocation of fatty acids.
Prostaglandins Leukot Essent Fatty Acids 57, 3-9.

Gottlicher, M., Widmark, E., Li, Q., and Gustafsson, J. A. (1992). Fatty acids activate
a chimera of the clofibric acid-activated receptor and the glucocorticoid receptor. Proc
Natl Acad Sci U S A 89, 4653-7.

Grimaldi, P. A., Teboul, L., Gaillard, D., Armengod, A. V., and Amri, E. Z. (1999).
Long chain fatty acids as modulators of gene transcription in preadipose cells. Mol
Cell Biochem 192, 63-8.

Hamilton, J. A. (1998). Fatty acid transport: difficult or easy? J Lipid Res 39, 467-81.

Hanson, R. W., and Reshef, L. (1997). Regulation of phosphoenolpyruvate
carboxykinase (GTP) gene expression. Annu Rev Biochem 66, 581-611.




Heinemeyer, T., Chen, X., Karas, H., Kel, A. E., Kel, O. V., Liebich, I., Meinhardt, T.,
Reuter, I., Schacherer, F., and Wingender, E. (1999). Expanding the TRANSFAC
database towards an expert system of regulatory molecular mechanisms. Nucleic
Acids Res 27, 318-22.

Heinemeyer, T., Wingender, E., Reuter, I., Hermjakob, H., Kel, A. E., Kel, O. V.,
Ignatieva, E. V., Ananko, E. A., Podkolodnaya, O. A., Kolpakov, F. A., Podkolodny,
N. L., and Kolchanov, N. A. (1998). Databases on transcriptional regulation:
TRANSFAC, TRRD and COMPEL. Nucleic Acids Res 26, 362-7.

Hirsch, D., Stahl, A., and Lodish, H. F. (1998). A family of fatty acid transporters
conserved from mycobacterium to man. Proc Natl Acad Sci U S A 95, 8625-9.

Hua, X., Liu, X., Ansari, D. O., and Lodish, H. F. (1998). Synergistic cooperation of
TFE3 and smad proteins in TGF-beta-induced transcription of the plasminogen
activator inhibitor-1 gene. Genes Dev 12, 3084-95.

Hui, T. Y., Frohnert, B. I., Smith, A. J., Schaffer, J. E., and Bernlohr, D. A. (1998).
Characterization of the murine fatty acid transport protein gene and its insulin
response sequence. J Biol Chem 273, 27420-9.

Lai, E. (1992). Regulation of hepatic gene expression and development. Semin Liver
Dis 12, 246-51.

Lee, Y. H., Yano, M., Liu, S. Y., Matsunaga, E., Johnson, P. F., and Gonzalez, F. J.
(1994). A novel cis-acting element controlling the rat CYP2D5 gene and requiring
cooperativity between C/EBP beta and an Sp1 factor. Mol Cell Biol 14, 1383-94.




Martin, G., Schoonjans, K., Lefebvre, A. M., Staels, B., and Auwerx, J. (1997).
Coordinate regulation of the expression of the fatty acid transport protein and
acyl-CoA synthetase genes by PPARalpha and PPARgamma activators. J Biol Chem 272,
28210-7.

Memon, R. A., Fuller, J., Moser, A. H., Smith, P. J., Grunfeld, C., and Feingold, K. R.
(1999). Regulation of putative fatty acid transporters and Acyl-CoA synthetase in liver
and adipose tissue in ob/ob mice. Diabetes 48, 121-7.

Motojima, K., Passilly, P., Peters, J. M., Gonzalez, F. J., and Latruffe, N. (1998).
Expression of putative fatty acid transporter genes are regulated by peroxisome
proliferator-activated receptor alpha and gamma activators in a tissue- and
inducer-specific manner. J Biol Chem 273, 16710-4.

Rodenburg, R. J., Holthuizen, P. E., and Sussenbach, J. S. (1997). A functional Sp1
binding site is essential for the activity of the adult liver-specific human insulin-like
growth factor II promoter. Mol Endocrinol 11, 237-50.

Rongnoparut, P., Verdon, C. P., Gehnrich, S. C., and Sul, H. S. (1991). Isolation and
characterization of the transcriptionally regulated mouse liver (B-type)
phosphofructokinase gene and its promoter. J Biol Chem 266, 8086-91.

Ryu, S., Zhou, S., Ladurner, A. G., and Tjian, R. (1999). The transcriptional cofactor
complex CRSP is required for activity of the enhancer-binding protein Sp1. Nature
397, 446-50.

Schaffer, J., and Lodish, H. F. (1995). Molecular mechanisms of long-chain fatty acid
uptake. Trends in Cardiovascular Medicine 5, 218-224.

Schaffer, J. E., and Lodish, H. F. (1994). Expression cloning and characterization of a
novel adipocyte long chain fatty acid transport protein [see comments]. Cell 79,
427-36.

Schoonjans, K., Staels, B., and Auwerx, J. (1996). The peroxisome proliferator
activated receptors (PPARs) and their effects on lipid metabolism and adipocyte
differentiation. Biochim Biophys Acta 1302, 93-109.

Sorensen, P., and Wintersberger, E. (1999). Sp1 and NF-Y are necessary and sufficient
for growth-dependent regulation of the hamster thymidine kinase promoter [In Process
Citation]. J Biol Chem 274, 30943-9.

Stahl, A., Hirsch, D. J., Gimeno, R. E., Punreddy, S., Ge, P., Watson, N., Patel, S.,
Kotler, M., Raimondi, A., Tartaglia, L. A., and Lodish, H. F. (1999). Identification of
the major intestinal fatty acid transport protein [In Process Citation]. Mol Cell 4,
299-308.

Stremmel, W. (1989). Mechanism of hepatic fatty acid uptake. Journal of Hepatology
9, 374-382.

All references cited herein are incorporated by reference in their entirety. 

While this invention has been particularly shown and described with references
to preferred embodiments thereof, it will be understood by those skilled in the art that
various changes in form and details may be made therein without departing from the
spirit and scope of the invention as defined by the appended claims.

CLAIMS

What is claimed is: 

1. An isolated nucleic acid comprising the nucleotide sequence of SEQ ID
NO.: 116 or its complement.

2. An isolated nucleic acid comprising the coding sequence of SEQ ID
NO.: 116.

3. An isolated nucleic acid which encodes a polypeptide comprising the
amino acid sequence of SEQ ID NO.: 117 or its complement.

4. An isolated nucleic acid which hybridizes under stringency conditions
of 6X SSC at 65° C, followed by at least two washes in 0.2X SSC/0.5%
SDS at 65° C, to the nucleic acid comprising the nucleotide sequence
of SEQ ID NO.: 116.

5. An isolated nucleic acid consisting of a nucleotide sequence having at
least 95% identity to a nucleotide sequence of Claim 1.

6. An isolated nucleic acid consisting of a nucleotide sequence having at
least 90% identity to a nucleotide sequence of Claim 1.

7. An isolated nucleic acid encoding a fusion polypeptide, wherein the
isolated nucleic acid comprises a nucleotide sequence of SEQ ID
NO.: 116.

8. A vector comprising a nucleic acid of Claim 1.

9. A vector comprising a nucleic acid of Claim 2.

10. A vector comprising a nucleic acid of Claim 3.

11. A vector comprising a nucleic acid of Claim 4.

12. A vector comprising a nucleic acid of Claim 5.

13. A vector comprising a nucleic acid of Claim 6.

14. A vector comprising a nucleic acid of Claim 7.

15. An isolated host cell transfected with the vector of Claim 8.

16. An isolated host cell transfected with the vector of Claim 9.

17. An isolated host cell transfected with the vector of Claim 10.

18. An isolated host cell transfected with the vector of Claim 11.

19. An isolated host cell transfected with the vector of Claim 12.






20. An isolated host cell transfected with the vector of Claim 13.

21. An isolated host cell transfected with the vector of Claim 14.

22. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 15 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

23. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 16 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

24. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 17 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

25. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 18 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

26. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 19 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

27. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 20 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

28. A method of producing a polypeptide comprising the step of culturing
the host cell of Claim 21 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.



29. An isolated nucleic acid comprising at least 30 contiguous nucleotides
of the nucleotide sequence of SEQ ID NO.:116.



30. An isolated nucleic acid comprising at least 200 contiguous nucleotides 
of the nucleotide sequence of SEQ ID NO.:116. 
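
For illustration only, the "at least N contiguous nucleotides" criterion of Claims 29 and 30 can be sketched in Python. The function name and the toy sequences in the usage note are hypothetical and are not taken from the sequence listing; the sketch compares a single strand and does not consider the reverse complement.

```python
def shares_contiguous_run(candidate: str, reference: str, min_len: int) -> bool:
    """Return True if `candidate` contains at least `min_len` contiguous
    nucleotides that also occur contiguously in `reference`.

    Hypothetical helper: compares one strand only, case-insensitively.
    """
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    candidate = candidate.upper()
    reference = reference.upper()
    if len(candidate) < min_len or len(reference) < min_len:
        return False
    # Every window of length min_len found in the reference sequence.
    reference_windows = {
        reference[i:i + min_len]
        for i in range(len(reference) - min_len + 1)
    }
    # The criterion is met if any candidate window matches a reference window.
    return any(
        candidate[i:i + min_len] in reference_windows
        for i in range(len(candidate) - min_len + 1)
    )
```

With a reference sequence loaded from the sequence listing, `shares_contiguous_run(candidate, reference, 30)` would correspond to the threshold of Claim 29 and `min_len=200` to that of Claim 30.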



31. An isolated polypeptide comprising the amino acid sequence of SEQ
ID NO.:117.

32. An isolated naturally occurring allelic variant of a polypeptide 
consisting of the amino acid sequence of Claim 31. 

33. An isolated polypeptide consisting of an amino acid sequence having at 
least 95% identity to the amino acid sequence of Claim 31. 



34. An isolated polypeptide consisting of an amino acid sequence having at
least 90% identity to the amino acid sequence of Claim 31.
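
The percent-identity thresholds of Claims 33 and 34 can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it counts identical residues over the aligned columns; this counting convention and the function name are assumptions made for the example, and whatever definition of identity the specification gives takes precedence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: '-' marks a gap; columns where both sequences have a gap
    are ignored, and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    columns = 0
    for residue_a, residue_b in zip(aligned_a.upper(), aligned_b.upper()):
        if residue_a == "-" and residue_b == "-":
            continue  # skip columns that are gaps in both sequences
        columns += 1
        if residue_a == residue_b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns
```

A candidate sequence would meet the threshold of Claim 33 when the returned value is at least 95, and that of Claim 34 when it is at least 90.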



35. An isolated polypeptide encoded by a nucleic acid that hybridizes to a
nucleic acid consisting of the nucleotide sequence of SEQ ID NO.:117
under stringency conditions of 6X SSC at 65° C, followed by at least
two washes in 0.2X SSC/0.5% SDS at 65° C.



36. A fusion protein comprising a polypeptide consisting of the amino acid
sequence of SEQ ID NO.:117.

37. The fusion protein of Claim 36, wherein the fusion protein transports
fatty acids across a cell membrane or an artificial cell membrane
system.

38. An isolated polypeptide comprising at least 15 contiguous amino acid
residues of SEQ ID NO.:117.

39. An isolated polypeptide comprising at least 50 contiguous amino acid
residues of SEQ ID NO.:117.

40. An isolated polypeptide comprising at least 360 contiguous amino acid
residues of SEQ ID NO.:117.

41. An isolated polypeptide comprising an amino acid sequence having at
least 15 contiguous amino acid residues of SEQ ID NO.:117, wherein
the isolated polypeptide transports fatty acids across a cell membrane or
an artificial cell membrane.

42. An isolated polypeptide encoded by a nucleic acid that hybridizes to a
nucleic acid consisting of the nucleotide sequence of SEQ ID NO.:116
under stringency conditions of 6X SSC at 65° C, followed by at least
two washes in 0.2X SSC/0.5% SDS at 65° C.



43. A method for identifying an agent which binds to a protein comprising
an amino acid sequence of SEQ ID NO.:117 comprising the steps of
contacting the agent with the isolated protein under conditions
appropriate for binding of the agent to the isolated protein, and
detecting a resulting agent-protein complex.



44. An agent identified by the method of Claim 43.



45. A method for identifying an agent which is an inhibitor of fatty acid
uptake by a protein encoded by a polynucleotide comprising a
nucleotide sequence which encodes a protein consisting of the amino
acid sequence of SEQ ID NO.:117, comprising the steps of:

a) maintaining test cells expressing said polynucleotide in the
presence of a fatty acid and an agent to be tested as an inhibitor
of fatty acid uptake;

b) measuring uptake of the fatty acid in the test cells; and

c) comparing uptake of the fatty acid in the test cells with uptake
of the fatty acid in suitable control cells;

wherein lower uptake of the fatty acid in the test cells compared to
uptake of the fatty acid in the control cells is indicative that the agent is
an inhibitor of fatty acid uptake by said protein.
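
The comparison in step c) of Claim 45 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: uptake is represented by replicate readings per condition, the readings are summarized by their arithmetic mean, and the `min_fraction_reduction` parameter is a hypothetical cutoff that the claim itself does not specify.

```python
from statistics import mean
from typing import Sequence, Tuple


def compare_uptake(
    test_uptake: Sequence[float],
    control_uptake: Sequence[float],
    min_fraction_reduction: float = 0.0,
) -> Tuple[bool, float]:
    """Compare fatty acid uptake in test cells (agent present) with uptake
    in control cells.

    Returns (flagged, reduction), where `reduction` is the fractional drop
    of mean test uptake relative to mean control uptake, and `flagged` is
    True when that drop exceeds the assumed cutoff.
    """
    control_mean = mean(control_uptake)
    if control_mean <= 0:
        raise ValueError("control uptake must be positive")
    reduction = (control_mean - mean(test_uptake)) / control_mean
    return reduction > min_fraction_reduction, reduction
```

For example, replicate readings of 400, 420 and 380 in test cells against 1000, 980 and 1020 in control cells give a reduction of 0.6, so the agent would be flagged as a candidate inhibitor. A real screen would normally add replicate statistics before drawing that conclusion.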

46. An inhibitor of fatty acid uptake identified by the method of Claim 45. 



47. The method of Claim 45 further comprising the steps of:

a) administering the agent to one or more test animals;

b) measuring exogenously supplied fatty acids in one or more
samples of tissue or bodily fluid from said test animals;

c) measuring exogenously supplied fatty acids in one or more
comparable samples of tissue or bodily fluid from suitable
control animals;

d) comparing the fatty acids of b) with the fatty acids of c);

whereby, lower fatty acids in step b) than in step c) is indicative that the
agent is an inhibitor of said protein.

48. An inhibitor of fatty acid uptake identified by the method of Claim 47.

49. The method of Claim 45, wherein the nucleotide sequence which
encodes a protein consists of a nucleotide sequence with 95% identity
to a nucleotide sequence which encodes the polypeptide with SEQ ID
NO.:117.

50. A method for identifying an agent which is an inhibitor of a protein
encoded by a polynucleotide comprising a nucleotide sequence which
encodes a protein comprising the amino acid sequence in SEQ ID
NO.:117 comprising the steps of:

(a) introducing into host cells one or more vectors comprising a
polynucleotide expressing said protein;

(b) culturing a first aliquot of the host cells with fatty acid substrate
of said protein and with an agent being tested as an inhibitor of
said protein;

(c) culturing a second aliquot of the host cells with fatty acid
substrate of said protein;

(d) measuring, in the first and second aliquots, uptake of the fatty
acid substrate of the host cells;

wherein less uptake of the fatty acid substrate in the first aliquot
compared to the second aliquot is indicative that the agent is an
inhibitor of said protein.



51. An inhibitor of fatty acid uptake identified by the method of Claim 50. 



52. The method of Claim 50 further comprising the steps of:

a) administering the agent to one or more test animals;

b) measuring exogenously supplied fatty acids in one or more
samples of tissue or bodily fluid from suitable control animals;

c) measuring exogenously supplied fatty acids in one or more
comparable samples of tissue or bodily fluid from the test
animals; and

d) comparing the fatty acids of the control animals with the fatty
acids of the test animals whereby, lower fatty acids in the
control animals than in the test animals is indicative that the
agent is an inhibitor of said protein.



53. A method for identifying an agent which binds to a protein comprising
an amino acid sequence of SEQ ID NO.:117 comprising the steps of
contacting the agent with the isolated protein under conditions
appropriate for binding of the agent to the isolated protein, and
detecting a resulting agent-protein complex.

54. A method for identifying an agent which inhibits interaction between
an isolated protein comprising an amino acid sequence of SEQ ID
NO.:117, and further comprising a ligand of said protein, comprising:

(a) combining:

(1) said isolated protein;

(2) the ligand of said protein; and

(3) a candidate agent to be assessed for its ability to inhibit
interaction between said protein of (1) and the ligand of
(2), under conditions appropriate for interaction
between the said protein of (1) and the ligand of (2);

(b) determining the extent to which said protein of (1) and the
ligand of (2) interact; and

(c) comparing the extent determined in (b) with the extent to which
interaction of said protein of (1) and the ligand of (2) occurs in
the absence of the candidate agent to be assessed and under the
same conditions appropriate for interaction of said protein of
(1) with the ligand of (2);

wherein if the extent to which interaction of said protein of (1) and the
ligand of (2) occurs is less in the presence of the candidate agent than
in the absence of the candidate agent, the candidate agent is an agent
which inhibits interaction between said protein and the ligand of said
protein.

55. A method for detecting, in a sample of cells, a nucleic acid molecule
consisting of a nucleotide sequence with at least 90% sequence identity
to SEQ ID NO.:116, comprising:

a) purifying nucleic acid from the cells;

b) hybridizing 1) purified nucleic acid from the cells to 2) purified nucleic
acid comprising SEQ ID NO.:116, under conditions that allow
hybridization between 1) and 2) if the sequences of 1) and 2) have at
least 90% sequence identity; and

c) detecting resulting hybrid nucleic acids in the hybridization; wherein, if
hybrid nucleic acids are detected at a significant level compared to a
suitable control hybridization, then a nucleic acid molecule comprising
at least 90% sequence identity to SEQ ID NO.:116 has been detected.



56. A method for identifying (1) nucleic acid molecules in fixed cells
which specifically interact with a (2) nucleic acid molecule comprising
the nucleotide sequence in SEQ ID NO.:116, said method comprising
the steps of:

a) adding to the fixed cells the nucleic acid molecule comprising a
nucleotide sequence in SEQ ID NO.:116;

b) incubating the fixed cells under conditions allowing
hybridization of (1) with (2);

c) removing the nucleic acid molecule of step a) that has not
hybridized; and

d) detecting hybrid molecules comprising (1) and (2).

57. A method for detecting FATP3 in a sample of cells, comprising the
steps of adding an agent that specifically binds to FATP3 to the
sample, and detecting the agent specifically bound to the FATP3.

58. The method of Claim 57 wherein the agent is an antibody which
specifically binds to FATP3.

59. A method for detecting FATP3 in a sample of cell lysate, comprising
the steps of adding an agent that specifically binds to FATP3 to the
sample, and detecting agent specifically bound to the FATP3.

60. The method of Claim 59 wherein the agent is an antibody which
specifically binds to FATP3.

61. An isolated antibody which binds to a polypeptide having an amino
acid sequence consisting of at least 95% amino acid sequence identity
with the amino acid sequence of SEQ ID NO.:117.

62. An isolated antibody which binds to a fatty acid transport protein
having the amino acid sequence of SEQ ID NO.:117.

63. A method for detecting, in a sample of cells, a nucleic acid molecule
comprising at least 90% sequence identity to SEQ ID NO.:116
comprising:

a) purifying nucleic acid from the cells;

b) hybridizing 1) purified nucleic acid from the cells to 2) purified
nucleic acid comprising SEQ ID NO.:116 under conditions that
allow hybridization between 1) and 2) if the sequences of 1)
and 2) have at least 90% sequence identity; and

c) detecting resulting hybrid nucleic acids in the hybridization;
wherein, if hybrid nucleic acids are detected at a significant
level compared to a suitable control hybridization, then a
nucleic acid molecule having at least 90% sequence identity to
SEQ ID NO.:116 has been detected.

64. A method for detecting, in a sample of purified nucleic acid, a nucleic
acid molecule comprising at least 90% sequence identity to SEQ ID
NO.:116 comprising:

a) hybridizing 1) the sample of purified nucleic acid to 2) purified
nucleic acid comprising SEQ ID NO.:116 under conditions that
allow hybridization between 1) and 2) if the sequences of 1)
and 2) have at least 90% sequence identity; and

b) detecting resulting hybrid nucleic acids in the hybridization;
wherein, if hybrid nucleic acids are detected at a significant
level compared to a suitable control hybridization, then a
nucleic acid molecule having at least 90% sequence identity to
SEQ ID NO.:116 has been detected.



65. A method for detecting FATP3 in a sample of cells, comprising the
steps of adding an agent that specifically binds to FATP3 to the
sample, and detecting agent specifically bound to the FATP3.



66. The method of Claim 65 wherein the agent is an antibody which binds 
to FATP3. 

67. A vector comprising a FATP regulatory sequence and at least one
targeting sequence directed to the regulatory region of a nucleic acid
with a nucleotide sequence selected from the group consisting of:

a) SEQ ID NO.:46;

b) SEQ ID NO.:48;

c) SEQ ID NO.:116;

d) SEQ ID NO.:52;

e) SEQ ID NO.:54; and

f) SEQ ID NO.:56.

68. An isolated host cell transfected with a vector of Claim 67.

69. A method of producing a polypeptide comprising culturing the host
cell of Claim 68 under conditions in which the nucleic acid is
expressed, thereby producing the polypeptide.

70. An isolated nucleic acid comprising a nucleotide sequence encoding a 
functional portion or fragment of a FATP polypeptide comprising a 
lipocalin domain. 

71. The isolated nucleic acid of Claim 70 further comprising a nucleotide
sequence encoding upstream amino acid residues.

72. An isolated nucleic acid comprising a nucleotide sequence encoding a
portion or fragment of a FATP protein containing a lipocalin domain,
wherein the nucleotide sequence is selected from the group consisting
of portions or fragments of:

a) SEQ ID NO.:46;

b) SEQ ID NO.:48;

c) SEQ ID NO.:116;

d) SEQ ID NO.:52;

e) SEQ ID NO.:54; and

f) SEQ ID NO.:56.

73. An isolated nucleic acid of Claim 72 further comprising at least about
90 nucleotides of the sequence upstream of the lipocalin domain.

74. A vector comprising a nucleic acid of Claim 73. 

75. An isolated host cell comprising the vector of Claim 74. 

76. A method of producing a polypeptide comprising the step of culturing 
the host cell of Claim 75 under conditions in which the nucleic acid is 
expressed, thereby producing the polypeptide. 

77. A functional portion or fragment of a FATP polypeptide comprising a 
lipocalin domain. 



78. The FATP polypeptide of Claim 77 further comprising upstream amino
acid residues.



79. An isolated polypeptide comprising an amino acid sequence containing
a FATP lipocalin domain, wherein the amino acid sequence is selected
from the group consisting of portions or fragments of:

a) SEQ ID NO.:47;

b) SEQ ID NO.:49;

c) SEQ ID NO.:117;

d) SEQ ID NO.:53;

e) SEQ ID NO.:55; and

f) SEQ ID NO.:57.

80. A functional portion or fragment of a FATP polypeptide comprising an
amino acid sequence selected from the group consisting of:

a) SEQ ID NO.:126;

b) SEQ ID NO.:127;

c) SEQ ID NO.:128;

d) SEQ ID NO.:129;

e) SEQ ID NO.:130; and

f) SEQ ID NO.:131.

81. A fusion protein comprising a polypeptide consisting of a FATP 
polypeptide containing a lipocalin domain. 



82. The fusion protein of Claim 81 further comprising upstream sequences. 

83. The fusion protein of Claim 82, wherein the upstream sequences 
comprise at least about 30 amino acid residues of an upstream 
sequence. 



84. A fusion protein comprising a polypeptide consisting of a FATP
polypeptide containing a lipocalin domain, wherein the polypeptide
consists of an amino acid sequence selected from the group consisting
of portions or fragments of:

a) SEQ ID NO.:47;

b) SEQ ID NO.:49;

c) SEQ ID NO.:117;

d) SEQ ID NO.:53;

e) SEQ ID NO.:55; and

f) SEQ ID NO.:57.



85. The fusion protein of Claim 84 further comprising upstream sequences. 



86. A method for identifying an agent which binds to a polypeptide,
wherein the polypeptide comprises a FATP lipocalin domain,
comprising the steps of contacting the agent with the polypeptide under
conditions appropriate for binding of the agent to the polypeptide, and
detecting a resulting agent-polypeptide complex.



87. The agent identified by the method of Claim 86. 



88. A method for identifying an agent which binds to a polypeptide,
wherein the polypeptide comprises a FATP lipocalin domain and about
30 amino acid residues of an upstream sequence, comprising the steps
of contacting the agent with the polypeptide under conditions
appropriate for binding of the agent to the polypeptide, and detecting a
resulting agent-polypeptide complex.



89. The agent identified by the method of Claim 88. 



90. A method for identifying an agent which binds to a polypeptide,
wherein the polypeptide comprises a FATP lipocalin domain and
consists of an amino acid sequence selected from the group consisting
of portions or fragments of:

a) SEQ ID NO.:47;

b) SEQ ID NO.:49;

c) SEQ ID NO.:117;

d) SEQ ID NO.:53;

e) SEQ ID NO.:55; and

f) SEQ ID NO.:57,

comprising the steps of contacting the agent with the polypeptide under
conditions appropriate for binding of the agent to the polypeptide, and
detecting a resulting agent-polypeptide complex.

91. An agent identified by the method of Claim 90.

92. A method for identifying an agent which binds to a polypeptide,
wherein the polypeptide comprises an amino acid sequence selected
from the group consisting of:

a) SEQ ID NO.:126;

b) SEQ ID NO.:127;

c) SEQ ID NO.:128;

d) SEQ ID NO.:129;

e) SEQ ID NO.:130;

f) SEQ ID NO.:131,

comprising the steps of contacting the agent with the polypeptide under
conditions appropriate for binding of the agent to the polypeptide, and
detecting a resulting agent-polypeptide complex.

93. An agent identified by the method of Claim 92.

94. A method for identifying an agent which binds to a polypeptide
comprising a FATP lipocalin domain, wherein the polypeptide is
encoded by a nucleotide sequence consisting of portions or fragments
of:

a) SEQ ID NO.:46;

b) SEQ ID NO.:48;

c) SEQ ID NO.:116;

d) SEQ ID NO.:52;

e) SEQ ID NO.:54; and

f) SEQ ID NO.:56,

comprising the steps of contacting the agent with the polypeptide under
conditions appropriate for binding of the agent to the polypeptide, and
detecting a resulting agent-polypeptide complex.

95. An agent identified by the method of Claim 94. 

96. A method for identifying an agent which binds to a polypeptide
comprising a FATP lipocalin domain and upstream sequences, wherein
the polypeptide is encoded by a nucleotide sequence consisting of
portions or fragments of:

a. SEQ ID NO.:46;

b. SEQ ID NO.:48;

c. SEQ ID NO.:116;

d. SEQ ID NO.:52;

e. SEQ ID NO.:54; and

f. SEQ ID NO.:56,

comprising the steps of contacting the agent with the polypeptide under
conditions appropriate for binding of the agent to the polypeptide, and
detecting a resulting agent-polypeptide complex.



97. An agent identified by the method of Claim 96. 

98. An isolated nucleic acid sequence comprising the nucleic acid
sequence of SEQ ID NO: 113.

99. The portion of the isolated nucleic acid sequence of Claim 98 which 
encodes a FATP regulatory protein. 

100. The portion of the isolated nucleic acid sequence of Claim 98 which
encodes a FATP5 promoter.

101. A method of identifying an agent which alters the level of expression
of the nucleic acid encoding an FATP protein comprising:

(a) determining a base level of expression of the nucleic acid encoding the
FATP protein;

(b) contacting an agent with an isolated nucleic acid containing the
coding region of the FATP protein under functional control of
its promoter under conditions suitable for binding of the agent
to the promoter;

(c) maintaining agent-promoter binding during expression of the
FATP protein; and

(d) comparing the level of expression of the agent bound promoter
to that of the baseline level of expression,

whereby, if the level of expression of the agent bound promoter is
significantly different from that of the baseline level of expression,
then an agent which alters the level of expression of the nucleic acid
encoding the FATP protein has been identified.
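
The comparison in step (d) of Claim 101 can be illustrated with a short Python sketch that labels expression as inhibited, promoted or unchanged relative to the baseline. The two-fold cutoff stands in for "significantly different" purely as an assumption for the example; the claim does not fix a numerical threshold, and the function name is hypothetical.

```python
def classify_expression(
    baseline_level: float,
    agent_level: float,
    min_fold_change: float = 2.0,
) -> str:
    """Classify expression measured with the agent bound to the promoter
    against the baseline level of expression.

    Assumption: a change counts as significant when it reaches
    `min_fold_change` in either direction.
    """
    if baseline_level <= 0 or agent_level < 0:
        raise ValueError("expression levels must be non-negative, baseline positive")
    if min_fold_change <= 1.0:
        raise ValueError("min_fold_change must be greater than 1")
    if agent_level >= baseline_level * min_fold_change:
        return "promoted"
    if agent_level <= baseline_level / min_fold_change:
        return "inhibited"
    return "unchanged"
```

With a baseline of 100 arbitrary units, a reading of 30 is classified as inhibited, 250 as promoted, and 120 as unchanged under the assumed cutoff.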



102. The method of Claim 101, wherein the FATP protein is FATP2. 

103. The method of Claim 102, wherein the FATP2 is encoded by a nucleic
acid comprising the nucleotide sequence of SEQ ID NO: 48.



104. The method of Claim 102, wherein the FATP2 comprises the amino 
acid sequence of SEQ ID NO: 49. 

105. The method of Claim 102, wherein expression is inhibited.



106. The method of Claim 102, wherein expression is promoted.



107. The method of Claim 101, wherein the FATP protein is FATP5. 



108. The method of Claim 107, wherein the FATP5 is encoded by a nucleic 
acid comprising the nucleotide sequence of SEQ ID NO: 54. 



109. The method of Claim 107, wherein the FATP5 comprises the amino
acid sequence of SEQ ID NO: 55.

110. The method of Claim 107, wherein expression is inhibited.

111. The method of Claim 107, wherein expression is promoted.

112. A method for directing an agent to liver cells in a mammal, comprising 
administering to the mammal a complex which comprises the agent 
and a moiety which binds to FATP2. 



113. The method of Claim 112, wherein the agent alters fatty acid uptake in 
liver cells. 



114. The method of Claim 112, wherein the agent alters the level of fatty
acids in bile.



115. A method for directing an agent to the gall bladder in a mammal,
comprising administering to the mammal a complex which comprises
the agent and a moiety which binds to FATP2.



116. The method of Claim 115, wherein the agent alters the level of fatty
acids in bile.



117. A method for directing an agent to the liver in a mammal, comprising
administering to the mammal a complex which comprises the
substance and a moiety which binds to FATP5.



118. The method of Claim 117, wherein the agent alters the uptake of fatty
acids in liver cells.

119. The use of an isolated nucleic acid comprising the nucleotide sequence
of SEQ ID NO.:116 or its complement in the manufacture of a
medicament.

120. The use of an isolated polypeptide comprising the amino acid sequence
of SEQ ID NO.:117 in the manufacture of a medicament.

121. The use of an agent which is an inhibitor of fatty acid uptake of a
protein with the amino acid sequence of SEQ ID NO.:117 in the
manufacture of a medicament.



122. The use of an isolated nucleic acid comprising a nucleotide sequence
encoding a portion or fragment of a FATP protein containing a
lipocalin domain in the manufacture of a medicament, wherein the
nucleotide sequence is selected from the group consisting of portions
or fragments of:

a) SEQ ID NO.:46;

b) SEQ ID NO.:48;

c) SEQ ID NO.:116;

d) SEQ ID NO.:52;

e) SEQ ID NO.:54; and

f) SEQ ID NO.:56.

123. The use of an isolated polypeptide comprising an amino acid sequence
containing a FATP lipocalin domain in the manufacture of a
medicament, wherein the amino acid sequence is selected from the
group consisting of portions or fragments of:

a) SEQ ID NO.:47;

b) SEQ ID NO.:49;

c) SEQ ID NO.:117;

d) SEQ ID NO.:53;

e) SEQ ID NO.:55; and

f) SEQ ID NO.:57.

124. The use of an isolated polypeptide in the manufacture of a medicament,
the polypeptide comprising an amino acid sequence selected from the
group consisting of:

1. SEQ ID NO.:126;

2. SEQ ID NO.:127;

3. SEQ ID NO.:128;

4. SEQ ID NO.:129;

5. SEQ ID NO.:130;

6. SEQ ID NO.:131.

125. The use of an isolated polypeptide in the manufacture of a medicament
for treating obesity, the polypeptide comprising an amino acid
sequence selected from the group consisting of:

7. SEQ ID NO.:126;

8. SEQ ID NO.:127;

9. SEQ ID NO.:128;

10. SEQ ID NO.:129;

11. SEQ ID NO.:130; and

12. SEQ ID NO.:131.
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mmFATP3 DNA 'sequence 

J^JJ^JJIC^n^^ 40 

GOZl^SCTESSG^^ 80 

G0CC5£O30&J&£Cn^^ 120 

G333ICia30C^lX?IG33^ 160 

CITCCIC£IIQ^13333333^^ 200 

GZIXS£30333>J3Z3^J^^ 240 

GCGC£GG33G2IG^J3^ 280 

G33CSGC?J2IC^33^J3G^ 320 



CajDaQG3GQ0G*J333^^ 400 




IGG 360 



FIG. 8A 
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GSfficrcCTnoGK^^ . — - 

CCCIC^IGCSOG^^ 520 

CXrlGCIGGGC^J^ 560 

CIQCXZGGCHnr^^ 600 

CgG3G0CIC^JOa2a^^ 640 

■i^ia3^I3Cm^J32^^ 680 

gSOCICICTEa^^ 720 

•jQ^ICaGICK^^ 800 

a^j3^JICT3IGIG^^ 840 

G^jTIGIGQ32IGCTIGGQ^ 920 

T^jgC&CSa&K&C^ 1000 

ilGXI^I^^^ 1040 

rpr^jx^jl^iSS^ 1080 rlo, O D 

GCTC^J3iX5C<^3^^ 1120 

TCIQCaeflSCr^^ 1160 

Giiixiisasn^^ 1200 

G3a3 ^ Gcr]X ^^ 1240 

^jvrGcn^j3G3CT^^ 1320 

Q^XOSCIGG^^ 1360 

CUIGGGC^TOZEGSGG^^ 1400 

CIQOI^GC^rGICITCa^ 1440 

jviMISGG^^^ I 480 

TC?O7IUIA03^D03I^ 1520 

GS^j^ilCSXSGCI^^ 1560 

j^£H2iG^i3?iaciE?^^ 1600 

CSOSGIGaSGGG^^ 1640 

rpjogcjicnxgosscaj^ 1680 

aj3^333TTIUI^^ 1760 

GCiTSggQJOcaGP^^ l 84 ^ 

GC^3^JO^I^3GG^ 1880 

^IDZTIOC^iZ!^ I960 

C^gSGpTTSciacSGG^^ 2000 

*IB^i3IC^IT?^ 2040 

a zi£ an a aa 2 aj^ajy^A^A^ 2080 

A?J^A-A 2087 
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nmF2I?3 protein sequence 

AaEEESSESGCHaW^^ 50 

J&m^J07IGG<RGS3^ 100 

M T J pajSOTDftlfr^ 150 
afWE T arer .-piPnr ,p at rem? pt wAmrOTirtKTi^rzrsK rr r ,qT^ mv7rra^7P 200 

GVIE^gsni^DICpv^^ 250 

DtflMX^YEMSC^^ 300 

VEQXrJC3ILaEI32N!^^ 350 

ii£n?xc2diK^^mc^ 400 

J$feQ333i!nnSEGB^ 450 

I77FmK23XT33BQp3Ii3E^^ 500 

QEtfimGTIVEGHS^ja^ 550 

BI^RLQESL^IETE^^ 600 
i^S?XIS3JLEI 613 



rm£^JI!P4 ENS. sequence 

CCTj3Ga3ia2300C2^^ 40 

GGIGG?I33CG3C?ICM 80 

000013030^^^ 120 

TCKILTnUSG^ 160 

CC£!IGCT^J3CHiI^ 200 

G^iTOCnX33^J3C03^ 240 

C3STC^J30C^2AM^ 320 r I G . 1 0 A 

J£rn^J3^!TJGG^^ 360 — 

TICIDGGICTJ3J3C?^ 400 

CT?£in£ID^^ 440 

csoi3CHimrcicn*i^^ 480 

GGC^jriasc&siG^^ 520 

OTCSacaaGITC^^ 560 

AlQ^^i^OrJC^ 600 

'iciocnszmj^iocT^ 640 

GTCiaSGCaj^GC^^ .680 

CGGCMTOJICIGC^lX^ 720 
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•KgaGCnSTTK^ 1000 

C&^aj^TTCC^ 1040 

GCOT£!CIC&Z^^ 1080 

^£TCaQG090Cf^^ 1200 

jiJESISriSO^^ 1240 

GQOI3Cmi7lGC£^^ 1280 

(xogogqc^k^^ 1360 FI G . 1 0 B 

CS^G3SCCITCS^^^ 1400 

(^GGCTTI^JXC^I^^ 1440 

TTSATITXraX^ 1560 

<^ETC^i3CXCTBGa^^ 1600 

£TCO^lX2iL*J3^ 1640 

^GIGZn^JICICCO^ 1680 

CIC<DCECITCCCO^J^^ 1720 

GCCCC&SGGITICI^ 1800 

GGC3Xr*G0GQGCX3^^ 184 ° ■ 

TOX3TC^JIC^3^ 1880 

G^GCC^JSICPSCC^ 1560 

i^CIQOC?i?^IG^ 2000 

Caj^33Tfia^il!CaC3^^ 2080 

TClGiaCTIDCIGaZIQICI^^ 2160 

caacisiaasGa^^ 2200 

Sf^GEGnsacnr^^ 2240 

TCIT^i^^^ 2280 
£AAiLAAA£J^i^^^ 2301 
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protein sequence 



jXX£32J%z^ 200 

Q^vjjusQEf 1 !^^ 240 pjp -j 1 

TjtfR«NEn^^ 320 

HPH^^m^^NKEO^ICVEK^ 360 

GVE^GIH^J^A^ 440 

^TPijRErmHKnBiKCT^ 480 

IH£?KGC^KLIX2^^ 507 
irn j E n ? irpp5 n$k sequence 

cacrnscae*^ 40 

aOCTIBCIGCIGa^^ 120 

^iXCCCCATO333i:<^ 160 

a^AAGTTcs^GG^sa:^ 400 

C^TCIGG^I^^ 440 FIG. 12A 

GaXX3GG3X3GOT^ 480 

C^aGiTOGS^m^ 680 

QG&GCXSO^ 720 

GG^J^iOTCCTiaX^iXTO^ 800 
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****1P5 protein seaaeS 
KRHEEET^^ 120 

sv&guxu^^ 200 

J»aBSDEVEaS^^ 2 - 

HERWEWE2M^ 32Q 

lawraQEaacnn^ FIG. 13 

KSSQaEg^VM^^ 5 20 
hsEATP2 im sequence 

oasmamai^^ 2 

GJEOIEaCTBC^ 2 g 0 

320 

imo^OTA^i^^ KfG, 14A 
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OBenaoratasaM^^^^^B, 10O0 
<* MeBM *^^ 1040 

a«w*ra«^^B» 1120 



^^SS^ FIG. 14 

-™*™*aAaaa*J^J«^ 1622 



mpyi|ii^^MJHS^J.J_C-3_i^ 

j^jCw^^^^^^^ 1622 
l3sBSffiE2 protein sequence 



S=5S^^SSSSS fig. is 

waiOTOOTJ^^^ 280 
G3ZEEF 286 



^^^3 xm sequence 

SSSSSSSSSffiS. FIG . i 
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imE&EP5 protein sequence 



H^ENLE^^ 240 

BERraQySBTCESBaEE^^ 320 

360 pig. 13 

TjgqVEEQEEaaHrra^^ 400 

^T^STEST^^^ 440 

ro: n3i*;E3^^ 480 

^GggaESETC^SW^^ 520 — 

reD3:n ^ RWK ^^ 560 

aj2IFEJSDX!H77Y^L^I 663 
h£EK3S2 INk sequence 

^TGGC^iriG^^^ 40 

QG^T^ilGIC^^ 240 

GTTJCEGCSOSOT^^ 280 

TKMnflsanssra^^ FIG. 14A 

aCXM^AO^il^^ 440 

i^GITCOTA^GIG^^^^ 720 
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z^GICSQCX^32^ PIG 16A 
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GGaxaacaGaoGiCT^^ 240 

Gir^ooiciaiGC^^ 280 

G3QCTGC^ID3GCAj3a3^^ 320 

TlTCQ¥XTimCTi3CT^ 360 • 

TIGO^iXTI3^IQC]aGGQa 400 

AQTCTITOGCC^I^^ 440 FTP 1fiR 

TOQGKIGGC^^^ 4 8Q r±K3t U 

G^XCMTJO^C^^ 520 

«K3(XCCICACWOX^^ 560 -. 

ii^i3^J3CIGIQ^^ 640 

TOGGCTGIC203^ 680 

(^rnrLrn^j^ 720 

TCiG£j3Cn2AC£A^^ 753 



hsF^iTP3 protein sequence 

QfcT£Iie3IVWEHD2V9^ 40 
nDQ3PIi?EHEia03IESWHS3!^^ FIG. 17 

TJTftfVQ^^ 120 

DEl^TVIIDQSSO 1 ^ 191 
hsE£JIP4 nssi sequence 

TC^J3IS^JOX3Ca^ 40 

(XXXaGGIQGCTaG^^ 2 '00 

TJnn&E2&33X2^ 280 FIG. 18A 

TIG3I^iD3IGIC?^J3^ 320 

G3C!C£G^£!QGQG^ 360 

GQGCO*GCT33IGGGro^ 400 

OGoaGcria^iGGC^^ 440 

CHiajXTIiajZLGSi^ 520 
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T^J3Z03I^J2riaOC^^ 560 

20^J3CXDGCX3GC^^ 640 pjQ 18B 

GQIt3iaaGGIC£C?^ 680 

CIGCT3IQGCX^3a^ 720 

734 

hsEATP4 pa?otein sequence 

IGEID3^r2^ 40 

YPIELVl^NEDIMEIJS^ FIG. 19 

KDH^EEX232&^ 160 
7JElj3ZLj"2EEEKIC2IE^^ 200 
V2B7Sl^EVH33H3 213 



hsFKFP5 DMA sequence 

asnxsacnXJTIGa^^ 40 

TCCTDGGGIGCT^i^D^ 80 

C£GCA!IGGa3It^ 160 

CSOO^iSGiaaGaCTQ^ 240 

GOX^IGIGIGQQ^^ 280 

ATTI03^rcnM33C^^^ 320 

GC^JO^TOSGSCTI^^ 360 

CX3GGGSa?IG2<GGC£JL^ 400 

TO3P3IO0OXn^DS^CI^^ 440 

GGa^J3aiEGIt^J33^I^^ 480 

Graj33GC^i333SGa^^ 520 

^J=^2C3GC^J3DC^ 560 

GCIGroQCSaaQ^JiGC^^ 600 

GGGgV . -^CT ^^r^^ 640 

*J332C^imc^rn^ 680 

(^□CTTOOGMGG^OSQ^ 720 

GIGCaGaGO aiUil^^ 760 

TT^JDGIGTJailGG^^ 800 

GSCG3OTliX3GCT3C!IGIGG^ "840 



FIG. 20A 
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Tia^J303QG£I^JO7IGI^^ 880 

agCX^EQS^GGTJ^ 960 

j^xetcigiti^ 1040 

GXXXni^£I33C^J^Aa^^ 1080 pjp 9f|R 

XXlLG^i33CTCnX^^ 1120 • ^ UD 

ccxzriTX^ircTCG^ 1200 

jOXia^DCIDC^]^^ 1240 

<j2^&gj333C2^ 1278 

hsE2EP5 protein sequence 

EGCH^IOTJLD^^ 4=0 

BELSEE^VEISrNTRC^^ 120 I" 1 0 . Z 1 

(gevK^jS^LAEQQII^ 159 
hsEA!TP6 sequence 

O X J llUiUU^l^A A^ 40 

togsgicsoxsc^js^^^ so 

jvT^jriGC^j^JZLT^ 120 

GAG^i^SS^^-^iS^ 160 

GgyumgGCaaaaaGAG^ 200 

j^i3^ITIGG£i^ 240 

TAJ3DC^J£?C^J^ 280 

OX^IXO^I'S^J^S^ 400 

520 

QpjTj^Q-txrpaj^ 560 

o^in^A^c^s^^^^ 640 

QaTCWajTI^^ 680 

TCTAIIGSIGIGGCn^ 720' 
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TIG^J&J&ZFFI^^^ 800 

CTBSXICTX^ 840 

O^QO^J^J3^J^ 880 

GXQ&2&nQ&IIT122^^ 920 

Tn^cnx^^j^^^ 960 

^j3^J33C^JTITm^ 1000 

AAft£?ITIRftI^ 1040 pJQ 22 B 

CT£J3^jy3y3IG£J^^^ 1080 

Aaxj3Q&AAJ33s^GCT3 i i^ 1120 

TirociTMaMcsyais^ 11 60 

^ETI^JG?ITCITI^^ 1200 

CITTI^X^TT^ 1240 

TIOZEn^JflH^A^ITIOT 1280 

GCGSFITG^i^AiltA^ 1320 

AAAkJITICI2&AIT^ 1360 
A 1361 



hsEAJIP6 protein sequence 

££^ZLiKKKFSA£^^ 50 

]^j:G^niBSD\MRE^^ 100 

NT -"PV^ ,T W I 'h-)-«- n K V i ^nKTTRHMRNn*^^ 7 TSH VKfyKWPF 150 

H3EO?YKHIKI2CriJCr^^ FIG. 23 

EWft^IvATIEVA^ 250 
SXfGEKVXEQVVTEX^^ 300 
LKISEPLYEMS^OlCSi^TiL^ 335 



mfcE3£EP n®. sequence. 

[Nucleotide sequence, positions 1-2007: bases illegible in the source scan]

FIG. 24B

mtFATP protein sequence

[Amino acid sequence: residues illegible in the source scan]

FIG. 25
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Met uptake (cpm/min) 
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ATGCTGCTTGGAGCCTCTCTGGTGGGGGCGCTACTGTTCTCCAAGC 
TAGTGCTGAAGCTGCCCTGGACCCAGGTGGGATTCTCCCTGTTGCT 
CCTGTACTTGGGGTCTGGTGGCTGGCGTTTCATCCGGGTCTTCATC
AAGACGGTCAGGAGAGATATCTTTGGTGGCATGGTGCTCCTGAAGG
TGAAGACCAAGGTGCGACGGTACCTTCAGGAGCGGAAGACGGTGCC
CCTGCTGTTTGCTTCAATGGTACAGCGCCACCCGGACAAGACAGCC
CTGATTTTCGAGGGCACAGACACTCACTGGACCTTCCGCCAGCTGG
ATGAGTACTCCAGTAGTGTGGCCAACTTCCTGCAGGCCCGGGGCCT 
GGCCTCAGGCAATGTAGTTGCCCTCTTTATGGAAAACCGCAATGAG
TTTGTGGGTCTGTGGCTAGGCATGGCCAAGCTGGGCGTGGAGGCGG
CTCTCATCAACACCAACCTTAGGCGGGATGCCCTGCGCCACTGTCT
TGACACCTCAAAGGCACGAGCTCTCATCTTTGGCAGTGAGATGGCC
TCAGCTATCTGTGAGATCCATGCTAGCCTGGAGCCCACACTCAGCC
TCTTCTGCTCTGGATCCTGGGAGCCCAGCACAGTGCCCGTCAGCAC
AGAGCATCTGGACCCTCTTCTGGAAGATGCCCCGAAGCACCTGCCC
AGTCACCCAGACAAGGGTTTTACAGATAAGCTCTTCTACATCTACA
CATCGGGCACCACGGGGCTACCCAAAGCTGCCATTGTGGTGCACAG
CAGGTATTATCGTATGGCTTCCCTGGTGTACTATGGATTCCGCATG 
CGGCCTGATGACATTGTCTATGACTGCCTCCCCCTCTACCACTCAA
GCAGGAAACATCGTGGGGATTGGCAGTGCTTACTCCACGGCATGAC
TGTGGTGATCCGGAAGAAGTTCTCAGCCTCCCGGTTCTGGGATGAT
TGTATCAAGTACAACTGCACAGTGGTACAGTACATTGGCGAGCTCT
GCCGCTACCTCCTGAACCAGCCACCCCGTGAGGCTGAGTCTCGGCA
CAAGGTGCGCATGGCACTGGGCAACGGTCTCCGGCAGTCCATCTGG
ACCGACTTCTCCAGCCGTTTCCACATCCCCCAGGTGGCTGAGTTCT
ATGGGGCCACTGAATGCAACTGTAGCCTGGGCAACTTTGACAGCCG
GGTGGGGGCCTGTGGCTTCAATAGCCGCATCCTGTCCTTTGTGTAC
CCTATCCGTTTGGTACGTGTCAATGAGGATACCATGGAACTGATCC
GGGGACCCGATGGAGTCTGCATTCCCTGTCAACCAGGTCAGCCAGG
CCAGCTGGTGGGTCGCATCATCCAGCAGGACCCTCTGCGCCGTTTC
GACGGGTACCTCAACCAGGGTGCCAACAACAAGAAGATTGCTAATG
ATGTCTTCAAGAAGGGGGACCAAGCCTACCTCACTGGTGACGTCCT 
GGTGATGGATGAGCTGGGTTACCTGTACTTCCGAGATCGCACTGGG 
GACACGTTCCGCTGGAAAGGGGAGAATGTATCTACCACTGAGGTGG
AGGGCACACTCAGCCGCCTGCTTCATATGGCAGATGTGGCAGTTTA
TGGTGTTGAGGTGCCAGGAACTGAAGGCCGAGCAGGAATGGCTGCC
GTTGCAAGTCCCATCAGCAACTGTGACCTGGAGAGCTTTGCACAGA
CCTTGAAAAAGGAGCTGCCTCTGTATGCCCGCCCCATCTTCCTGCG
CTTCTTGCCTGAGCTCCACAAGACAGGGACCTTCAAGTTCCAGAAG
ACAGAGTTGCGGAAGGAGGGCTTTGACCCATCTGTTGTGAAAGACC
CGCTGTTCTATCTGGATGCTCGGAAGGGCTGCTACGTTGCACTGGA
CCAGGAGGCCTATACCCGCATCCAGGCAGGCGAGGAGAAGCTGTGA
TTTCCCCCTACATCCCTCTGAGGGCCAGAAGATGCTGGATTCAGAG
CCCTAGCGTCCACCCCAGAGGGTCCTGGGCAATGCCAGACCAAAGC
TAGCAGGGCCCGCACCTCCGCCCCTAGGTGCTGATCTCCCCTCTCC
CAAACTGCCAAGTGACTCACTGCCGCTTCCCCGACCCTCCAGAGGC
TTTCTGTGAAAGTCTCATCCAAGCTGTGTCTTCTGGTCCAGGCGTG
GCCCCTGGCCCCAGGGTTTCTGATAGGCTCCTTTAGGATGGTATCT
TGGGTCCAGCGGGCCAGGGTGTGGGAGAGGAGTCACTAAGATCCCT
CCAATCAGAAGGG.AGCTTACAAASG.AA.CC^ 

CTCAGGAAGCTAAGTGGCCAGAGACTATAGTGGCCAGTCATCCCAT
GTCCACAGAGGATCTTGGTCCAGAGCTGCCAAAGTGTCACCTCTCC
CTGCCTGCACCTCTGGGGAAAAGAGGACAGCATGTGGCCACTGGGC
ACCTGTCTCAAGAAGTCAGGATCACACACTCAGTCCTTGTTTCTCC
AGGTTCCCTTGTTCTTGTCTCGGGGAGGGAGGGACGAGTGTCCTGT
CTGTCCTTCCTGCCTGTCTGTGAGTCTGTGTTGCTTCTCCATCTGT
CCTAGCCTGAGTGTGGGTGGAACAGGCATGAGGAGAGTGTGGCTCA
GGGGCCAAT A.AACT CTGC CTTGACTCCTCTTAAAJ^AJLAA^ 
ZJ^J^J^J^^ AAJLAAAAAA AA AJ^AAAAJLAAAAA 
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MLLGAS LVGALLFS KLVLKL PWTQVGFS LLLLYLGS GGWRF IRVF I 
KTVRRDIFGGMVLLKVKTKVRRYLQERKTVPLLFASMVQRHPDKTA
LIFEGTDTHWTFRQLDEYSSSVANFLQARGLASGNVVALFMENRNE
FVGLWLGMAKLGVEAALINTNLRRDALRHCLDTSKARALIFGSEMA
SAICEIHASLEPTLSLFCSGSWEPSTVPVSTEHLDPLLEDAPKHLP
SHPDKGFTDKLFYIYTSGTTGLPKAAIVVHSRYYRMASLVYYGFRM
RPDDIVYDCLPLYHSSRKHRGDWQCLLHGMTVVIRKKFSASRFWDD
CIKYNCTVVQYIGELCRYLLNQPPREAESRHKVRMALGNGLRQSIW
TDFSSRFHIPQVAEFYGATECNCSLGNFDSRVGACGFNSRILSFVY
PIRLVRVNEDTMELIRGPDGVCIPCQPGQPGQLVGRIIQQDPLRRF
DGYLNQGANNKKIANDVFKKGDQAYLTGDVLVMDELGYLYFRDRTG
DTFRWKGENVSTTEVEGTLSRLLHMADVAVYGVEVPGTEGRAGMAA
VASPISNCDLESFAQTLKKELPLYARPIFLRFLPELHKTGTFKFQK
TELRKEGFDPSVVKDPLFYLDARKGCYVALDQEAYTRIQAGEEKL
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10 20 30 40 

TCGACCCACGGCGTCCGGGACCCCAAAGCAGAAGCCCGCA 40 
CAGTAGGCACAGCGCACCCAAGAAGGGTCCAGGAGTCTGC 80 
AGAAACAGAAAGGTCCCCGGCCTCAGCCTCCTAGTCCCTG 120 
CCTGCCTCCTGCCTGAGCTTCTGGGAGACTGAAGGCACGG 160
CTTGCAGCTTCAGGATGCGGGCTCCGGGTGCGGGCGCGGC 200 

210 220 230 240 

CTCGGTGGTCTCGCTGGCGCTGTTGTGGCTGCTGGGGGTG 240 
CCGTGGACCTGGAGCGCGGCAGCGGCGCTCGGCGTGTACG 280 
TGGGCAGCGGCGGCTGGCGCTTCCTGCGCATCGTCTGCAA 320 
GACCGCGAGGCGAGACCTCTTCGGTCTCTCTGTGCTGATC 360 
CGCGTGCGCCTGGAGCTGCGGCGGCACCAGCGTGCCGGCC 400 

410 420 430 440 

ACACCATCCCGCGCATCTTTCAGGCGGTAGTGCAGCGACA 440 
GCCCGAGCGCCTGGCGCTGGTGGATGCCGGGACCGGCGAG 480
TGCTGGACCTTTGCGCAGCTGGACGCCTACTCCAATGCGG 520 
TAGCCAACCTCTTCCGCCAGCTGGGCTTCGCGCCGGGCGA 560 
CGTGGTGGCCATCTTCCTGGAGGGCCGGCCGGAGTTCGTG 600 

610 620 630 640 

GGGCTGTGGCTGGGCCTGGCCAAGGCGGGCATGGAGGCCG 640
CGCTGCTCAACGTGAACCTGCGGCGCGAGCCCCTGGCCTT 680
CTGCCTGGGCACCTCGGGCGCTAAGGCCCTGATCTTTGGA 720
GGAGAAATGGTGGCGGCGGTGGCCGAAGTGAGCGGGCATC 760 
TGGGGAAAAGTTTGATCAAGTTCTGCTCTGGAGACTTGGG 800 

810 820 830 840 

GCCCGAGGGCATCTTGCCGGACACCCACCTCCTGGACCCG 840 
CTGCTGAAGGAGGCCTCTACTGCCCCCTTGGCACAGATCC 880 
CCAGCAAGGGCATGGACGATCGTCTTTTCTACATCTACAC 920
GTCGGGGACCACCGGGCTGCCCAAGGCTGCCATTGTCGTG 960
CACAGCAGGTACTACCGCATGGCAGCCTTCGGCCACCACG 1000

1010 1020 1030 1040 

CCTACCGCATGCAGGCGGCTGACGTGCTCTATGACTGCCT 1040 
GCCCCTGTACCACTCGGCAGGAAACATCATCGGCGTGGGG 1080 
CAGTGTCTCATCTATGGGCTGACAGTCGTCCTCCGCAAGA 1120
AATTCTCGGCGAGCCGCTTCTGGGACGACTGCATCAAGTA 1160
CAACTGCACGGTGGTTCAGTACATCGGGGAGATCTGCCGC 1200 
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1210 1220 1230 1240 

TACCTGCTGAAGCAGCCGGTGCGCGAGGCGGAGAGGCGAC 1240 

ACCGCGTGCGCCTGGCGGTGGGGAACGGGCTGCGTCCTGC 1280

CATCTGGGAGGAGTTCACGGAGCGCTTCGGCGTACGCCAA 1320

ATCGGGGAGTTCTACGGCGCCACCGAGTGCAACTGCAGCA 1360 

TTGCCAACATGGACGGCAAGGTCGGCTCCTGTGGTTTCAA 1400 

1410 1420 1430 1440 

CAGCCGCATCCTGCCCCACGTGTACCCCATCCGGCTGGTG 1440
AAGGTCAATGAGGACACAATGGAGCTGCTGCGGGATGCCC 1480
AGGGCCTCTGCATCCCCTGCCAGGCCGGGGAGCCTGGCCT 1520
CCTTGTGGGTCAGATCAACCAACAGGACCCGCTGCGCCGC 1560
TTCGATGGCTATGTCAGCGAGAGCGCCACCAGCAAGAAGA 1600 
1610 1620 1630 1640 

TCGCCCACAGCGTCTTCAGCAAGGGCGACAGCGCCTACCT 1640
CTCAGGTGACGTGCTAGTGATGGATGAGCTGGGCTACATG 1680
TACTTCCGGGACCGTAGCGGGGACACCTTCCGCTGGCGAG 1720
GGGAGAACGTCTCCACCACCGAGGTGGAGGGCGTGCTGAG 1760
CCGCCTGCTGGGCCAGACAGACGTGGCCGTCTATGGGGTG 1800 

1810 1820 1830 1840

GCTGTTCCAGGAGTGGAGGGTAAGGCAGGGATGGCGGCCG 1840 
TCGCAGACCCCCACAGCCTGCTGGACCCCAACGCGATATA 1880
CCAGGAGCTGCAGAAGGTGCTGGCACCCTATGCCCGGCCC 1920
ATCTTCCTGCGCCTCCTGCCCCAGGTGGACACCACAGGCA 1960
CCTTCAAGATCCAGAAGACGAGGCTGCAGCGAGAGGGCTT 2000

2010 2020 2030 2040 

TGACCCACGCCAGACCTCAGACCGGCTCTTCTTCCTGGAC 2040

CTGAAGCAGGGCCACTACCTGCCCTTAAATGAGGCAGTCT 2080 

ACACTCGCATCTGCTCGGGCGCCTTCGCCCTCTGAAGCTG 2120 

i TCCTCTACTGGCC ACAAACTCTGGGCCTGGTGGGAGAGG 2160 

CCAGCTTGAGCCAGACAGCGCTGCCCAGGGGTGGCCGCCT 2200 
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2610 2620 2630 2640 

GGTCAGGCTGGTCTTGAACTCCTGACCTCAGGTGATCCGC 2640 
TGGCCTCGGCCTCCCAGAGTGCTGGGATTATAGGCGTGAG 2680 
CCTCTGGCCCGGCCTTTCCTTTTTCCTCTCCTCTCCTGCC 2720 
GAGAGTGGAACACACGTGTCCTGGGAGCTGCATCTTGTGT 2760 
AGGGTCCAGCTGCTTTTGGGGACTGCAGGAATCATCTCCC 2800

2810 2820 2830 2840

CTGGGCCCTGGACTCGGACTGGGGCCTCCCCACCTCCCTC 2840
TCGGCTGTGCCTTACGGAGCCCCAATCCAGGCCTCCTGTG 2880 
GCTGTTGGGTTCCAGATGCTGCAGCTCCATGTGACTTCCA 2920 
AGCAGGCCCTCCGCCCTCCCTGCTGAATGGAGGAGCCGGG 2960 
GGTCCCCCAGGCCAACTGGAAAATCTCCCAGGCTAGGCCA 3000

3010 3020 3030 3040 

ATTGCCTTTTGCACTTCCCCGTTCCTGTCACATTTCCCCA 3040 
GCCCCACCTTCCCCTCCTGATGCCCTGAAAGCTTCCGGAA 3080 
TTGACTGTGACCACTTGGATGTCACCACTGTCAGCCCCTG 3120 
CCTTGATGTCCCCATTTAGCCATCTCCATGGAGCTCCTGC 3160 
TGGAGGGCCCTGAACCCTGCACTGCGTGGCTGCCCAGCCA 3200 

3210 3220 3230 3240 

GCTGCCTCCTGTCCTGGGAGGAGGCCTCCTGGGTGTCCTC 3240 
ATCTGGTGTGTCTACTGGAGGGTCCCACAGGAGAGGCAGC 3280
AGAGGGGTCAGGGGAGGTCTCCTGCCGGGGGTTGGCCTCT 3320 
CAAGCCTCAGGGGTTCTAGCCTGTTGAATATACCCCACCT 3360 
GGTGGGTGGCCCCTCCGATGTCCCCACTGATGGCTCTGAC 3400 

3410 3420 3430 3440 

ACCGTGTTGGTGGCGATGTCCCAGACAATCCCACCAGGAC 3440 
GGCCCAGACATCCCTACTGGCTTCGCTGGTGGCTCATCTC 3480 
GAACATCCACGCCAGCCTTTCTGGGGCCGGCCACCCAGGC 3520
CGCCTGTCCGTCTGTCCTCCCTCCAGCAGCACCCCCTGGC 3560 
CCCTGGAGTGGTGGGGCCATGGCAAGAGACACCGTGGCGT 3600 

3610 3620 3630 3640 

CTCATGTGAACTTTCCTGGGCACTGTGGTTTTATTTCCTA 3640 
ATTGATTTAAGAAATAAACCTGAAGACCGTCTGGTGAAAA 3680 
AAAAAAAAAAAAAA 3694
MRAPGAGAASVVSLALLWLLGLPWTWSAAAALGVYVGSGG 40
WRFLRIVCKTARRDLFGLSVLIRVRLELRRHQRAGHTIPR 80
IFQAVVQRQPERLALVDAGTGECWTFAQLDAYSNAVANLF 120
RQLGFAPGDVVAIFLEGRPEFVGLWLGLAKAGMEAALLNV 160
NLRREPLAFCLGTSGAKALIFGGEMVAAVAEVSGHLGKSL 200
210 220 230 240 

IKFCSGDLGPEGILPDTHLLDPLLKEASTAPLAQIPSKGM 240
DDRLFYIYTSGTTGLPKAAIVVHSRYYRMAAFGHHAYRMQ 280
AADVLYDCLPLYHSAGNIIGVGQCLIYGLTVVLRKKFSAS 320
RFWDDCIKYNCTVVQYIGEICRYLLKQPVREAERRHRVRL 360
AVGNGLRPAIWEEFTERFGVRQIGEFYGATECNCSIANMD 400

410 420 430 440 

GKVGSCGFNSRILPHVYPIRLVKVNEDTMELLRDAQGLCI 440
PCQAGEPGLLVGQINQQDPLRRFDGYVSESATSKKIAHSV 480
FSKGDSAYLSGDVLVMDELGYMYFRDRSGDTFRWRGENVS 520
TTEVEGVLSRLLGQTDVAVYGVAVPGVEGKAGMAAVADPH 560
SLLDPNAIYQELQKVLAPYARPIFLRLLPQVDTTGTFKIQ 600

610 620 630 640 

KTRLQREGFDPRQTSDRLFFLDLKQGHYLPLNEAVYTRIC 640
SGAFAL 646 
GGAATTCCAAAAAAAAAAAATACGACTAC ACCTGCTCCGG 40 
AGCCCGCGGCGGTACCTGCAGCGGAGGAGCTCTGTCTTCC 80 
CCTTCATCTCACGCGAGCCCGGCGTCCCGCCGCGTGCGCC 120 
CCGGCGCAGCCCGCCAGTCCGCCCGGAGCCCGCCCAGTCG 160 
CCGCGCTGCACGCCCGGGGTGAACCCTCTGCCCTCGCTGG 200 

210 220 230 240 

GACAGAGGGCCCCGCAGCCGTCATGCTTTCCGCCATCTAC 240 
ACAGTCCTGGCGGGACTGCTGTTCCTGCCGCTCCTGGTGA 280 
ACCTCTGCTGCCCATACTTCTTCCAGGACATAGGCTACTT 320 
CTTGAAGGTGGCCGCCGTGGGCCGGAGGGTGCGCAGCTAC 360 
GGGCAGCGGCGGCCGGCGCGCACCATCCTGCGGGCGTTCC 400 

410 420 430 440 

TGGAGAAAGCGCGCCAGACGCCACACAAGCCTTTTCTGCT 440 
CTTCCGCGACGAGACTCTCACCTACGCGCAGGTGGACCGG 480 
CGCAGCAATCAAGTGGCCCGGGCGCTGCACGACCACCTCG 520 
GCCTGCGCCAGGGAGACTGCGTGGCGCTCCTTATGGGTAA 560
CGAGCCGGCCTACGTGTGGCTGTGGCTGGGGCTGGTGAAG 600

610 620 630 640 

CTGGGCTGTGCCATGGCGTGCCTCAATTACAACATCCGCG 640 
CGAAGTCCCTGCTGCACTGCTTCCAGTGCTGCGGGGCGAA 680 
GGTGCTGCTGGTGTCGCCAGAACTACAAGCAGCTGTCGAA 720
GAGATACTGCCAAGCCTTAAAAAAGATGATGTGTCCATCT 760
ATTATGTGAGCAGAACTTCTAACACAGATGGGATTGACTC 800 

810 820 830 840 

TTTCCTGGACAAAGTGGATGAAGTATCAACTGAACCTATC 840 
CCAGAGTCATGGAGGTCTGAAGTCACTTTTTCCACTCCTG 880 
CCTTATACATTTATACTTCTGGAACCACAGGTCTTCCAAA 920
AGCAGCCATGATCACTCATCAGCGCATATGGTATGGAACT 960 
GGCCTCACTTTTGTAAGCGGATTGAAGGCAGATGATGTCA 1000 

1010 1020 1030 1040 

TCTATATCACTCTGCCCTTTTACCACAGTGCTGCACTACT 1040 
GATTGGCATTCACGGATGTATTGTGGCTGGTGCTACTCTT 1080
GCCTTGCGGACTAAATTTTCAGCCAGCCAGTTTTGGGATG 1120
ACTGCAGAAAATACAACGTCACTGTCATTCAGTATATCGG 1160
TGAACTGCTTCGGTATTTATGCAACTCACCACAGAAACCA 1200 
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1210 1220 1230 1240 

AATGACCGTGATCATAAAGTGAGACTGGCACTGGGAAATG 1240 

GCTTACGAGGAGATGTGTGGAGACAATTTGTCAAGAGATT 1280 

TGGGGACATATGCATCTATGAGTTCTATGCTGCCACTGAA 1320 

GGCAATATTGGATTTATGAATTATGCGAGAAAAGTTGGTG 1360 

CTGTTGGAAGAGTAAACTACCTACAGAAAAAAATCATAAC 1400

1410 1420 1430 1440 

TTATGACCTGATTAAATATGATGTGGAGAAAGATGAACCT 1440

GTCCGAGATGAAAATGGATATTGCGTCAGAGTTCCCAAAG 1480

GTGAAGTTGGACTTCTGGTTTGCAAAATCACACAACTTAC 1520 

ACCATTTAATGGCTATGCTGGAGCAAAGGCTCAGACAGAG 1560 

AAGAAAAAACTGAGAGATGTCTTTAAGAAAGGAGACCTCT 1600

1610 1620 1630 1640

ATTTCAACAGTGGAGATCTCTTAATGGTTGACCATGAAAA 1640 
TTTCATCTATTTCCACGACAGAGTTGGAGATACATTCCGG 1680
TGGAAAGGGGAAAATGTGGCCACCACTGAAGTTGCTGATA 1720 
CAGTTGGACTGGTTGATTTTGTCCAAGAAGTAAATGTTTA 1760 
TGGAGTGCATGTGCCAGATCATGAGGGTCGCATTGGCATG 1800 

1810 1820 1830 1840 

GCCTCCATCAAAATGAAAGAAAACCATGAATTTGATGGAA 1840
AGAAACTCTTTCAGCACATTGCTGATTACCTACCTAGTTA 1880
TGCAAGGCCCCGGTTTCTAAGAATACAGGACACCATTGAG 1920
ATCACTGGAACTTTTAAACACCGCAAAATGACCCTGGTGG 1960
AGGAGGGCTTTAACCCTGCTGTCATCAAAGATGCCTTGTA 2000 

2010 2020 2030 2040 

TTTCTTGGATGACACAGCAAAAATGTATGTGCCTATGACT 2040 
GAGGACATCTATAATGCCATAAGTGCTAAAACCCTGAAAC 2080 
TCTGAATATTCCCAGGAGGATAACTCAACATTTCCAGAAA 2120
GAAACTGAATGGACAGCCACTTGATATAATCCAACTTTAA 2160
TTTGATTGAAGATTGTGAGGAAATTTTGTAGGAAATTTGC 2200 

2210 2220 2230 2240 

ATACCCGTAAAGGGAGACTTTTTTAAATAACAGTTGAGTC 2240 
TTTGCAAGTAAAAAGATTTAGAGATTATTATTTTTCAGTG 2280 
TGCACCTACTGTTTGTATTTGCAAACTGAGCTTGTTGGAG 2320 
GGAAGGCATTATTTTTTAAAATACTTAGTAAATTAAATGA 2360 
AC 2362
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MLS A I YTVLAGLLFLPLLVNLCCPYFFQD I GYFLKVAAVG 40 
RRVRSYGQRRPARTILRAFLEKARQTPHKPFLLFRDETLT 80
YAQVDRRSNQVARALHDHLGLRQGDCVALLMGNEPAYVWL 120
WLGLVKLGCAMACLNYNIRAKSLLHCFQCCGAKVLLVSPE 160
LQAAVEEILPSLKKDDVSIYYVSRTSNTDGIDSFLDKVDE 200

210 220 230 240 

VSTEPIPESWRSEVTFSTPALYIYTSGTTGLPKAAMITHQ 240
RIWYGTGLTFVSGLKADDVIYITLPFYHSAALLIGIHGCI 280
VAGATLALRTKFSASQFWDDCRKYNVTVIQYIGELLRYLC 320
NSPQKPNDRDHKVRLALGNGLRGDVWRQFVKRFGDICIYE 360
FYAATEGNIGFMNYARKVGAVGRVNYLQKKIITYDLIKYD 400

410 420 430 440 

VEKDEPVRDENGYCVRVPKGEVGLLVCKITQLTPFNGYAG 440
AKAQTEKKKLRDVFKKGDLYFNSGDLLMVDHENFIYFHDR 480
VGDTFRWKGENVATTEVADTVGLVDFVQEVNVYGVHVPDH 520
EGRIGMASIKMKENHEFDGKKLFQHIADYLPSYARPRFLR 560
IQDTIEITGTFKHRKMTLVEEGFNPAVIKDALYFLDDTAK 600
610 620 630 640 

MYVPMTEDIYNAISAKTLKL 620 
AAGTTCTCGGCTGGTCAGTTCTGGGAAGATTGCCAGCAGC 40

ACAGGGTGACGGTGTTCCAGTACATTGGGGAGCTGTGCCG 80

ATACCTTGTCAACCAGCCCCCGAGCAAGGCAGAACGTGGC 120 

CATAAGGTCCGGCTGGCAGTGGGCAGCGGGCTGCGCCCAG 160 

ATACCTGGGAGCGTTTTGTGCGGCGCTTCGGGCCCCTGCA 200 

210 220 230 240 

GGTGCTGGAGACATATGGACTGACAGAGGGCAACGTGGCC 240 
ACCATCAACTACACAGGACAGCGGGGCGCTGTGGGGCGTG 280 
CTTCCTGGCTTTACAAGCATATCTTCCCCTTCTCCTTGAT 320 
TCGCTATGATGTCACCACAGGAGAGCCAATTCGGGACCCC 360 
CAGGGGCACTGTATGGCCACATCTCCAGGTGAGCCAGGGC 400 

410 420 430 440

TGCTGGTGGCCCCGGTAAGCCAGCAGTCCCCATTCCTGGG 440 
CTATGCTGGCGGGCCAGAGCTGGCCCAGGGGAAGTTGCTA 480
AAGGATGTCTTCCGGCCTGGGGATGTTTTCTTCAACACTG 520
GGGACCTGCTGGTCTGCGATGACCAAGGTTTTCTCCGCTT 560
CCATGATCGTACTGGAGACACCTTCAGGTGGAAGGGGGAG 600

610 620 630 640 

AATGTGGCCACAACCGAGGTGGCAGAGGTCTTCGAGGCCC 640 
TAGATTTTCTTCAGGAGGTGAACGTCTATGGAGTCACTGT 680 
GCCAGGGCATGAAGGCAGGGCTGGAATGGCAGCCCTAGTT 720 
CTGCGTCCCCCCCACGCTTTGGACCTTATGCAGCTCTACA 760 
CCCACGTGTCTGAGAACTTGCCACCTTATGCCCGGCCCCG 800 

810 820 830 840

ATTCCTCAGGCTCCAGGAGTCTTTGGCCACCACAGAGACC 840 
TTCAAACAGCAGAAAGTTCGGATGGCAAATGAGGGCTTCG 880
ACCCCAGCACCCTGTCTGACCCACTGTACGTTCTGGACCA 920 
GGCTGTAGGTGCCTACCTGCCCCTCACAACTGCCCGGTAC 960 
AGCGCCCTCCTGGCAGGAAACCTTCGAATCTGAGAACTTC 1000 

1010 1020 1030 1040

CACACCTGAGGCACCTGAGAGAGGAACTCTGTGGGGTGGG 1040 
GGCCGTTGCAGGTGTACTGGGCTGTCAGGGATCTTTTCTA 1080
TACCAGAACTGCGGTCACTATTTTGTAATAAATGTGGCTG 1120
GAGCTGATCCAGCTGTCTCTGACAAAAAAAAAAAAAAAAA 1160
AAAGGGCGGCCGC 1173 
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KFSAGQFWEDCQQHRVTVFQY I GELCRYLVNQPPSKAERG 40 
HKVRLAVGSGLRPDTWERFVRRFGPLQVLETYGLTEGNVA 80 
TINYTGQRGAVGRASWLYKHIFPFSLIRYDVTTGEPIRDP 120
QGHCMATSPGEPGLLVAPVSQQSPFLGYAGGPELAQGKLL 160
KDVFRPGDVFFNTGDLLVCDDQGFLRFHDRTGDTFRWKGE 200 

210 220 230 240 

NVATTEVAEVFEALDFLQEVNVYGVTVPGHEGRAGMAALV 240 

LRPPHALDLMQLYTHVSENLPPYARPRFLRLQESLATTET 280 

FKQQKVRMANEGFDPSTLSDPLYVLDQAVGAYLPLTTARY 320
SALLAGNLRI 330
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.CGACCCACGCGTCCGGGCGGGCGGGGCCGGGCGGCGGGCG 40 
GGGCTGGCGGGGCGGCCGGGCCATGCAGGGCGCAGAGCCG 80 
GCTAAACCCTGCTGAGACCCGGCTCCGTGCGTCCAGGGGC 120 
GGCTAATGCCCCTCACGCTGTCTACGCTGCTGCAACCGGG 160 
CCGCATCTGGACGGGGCGCCGCGCGGCGGAGCCGACGCCG 200 

210 220 230 240 

GGCCACAATGCTGCTTGGAGCCTCTCTGGTGGGGGTGCTG 240 
CTGTTCTCCAAGCTGGTGCTGAAACTGCCCTGGACCCAGG 280 
TGGGATTCTCCCTGTTGTTCCTCTACTTGGGATCTGGCGG 320 
CTGGCGCTTCATCCGGGTCTTCATCAAGACCATCAGGCGC 360 
GATATCTTTGGCGGCCTGGTCCTCCTGAAGGTGAAGGCAA 400 

410 420 430 440 

AGGTGCGACAGTGCCTGCAGGAGCGGCGGACAGTGCCCAT 440 
TTTGTTTGCCTCTACCGTTCGGCGCCACCCCGACAAGACG 480
GCCCTGATCTTCGAGGGCACAGATACCCACTGGACCTTCC 520 
GCCAGCTGGATGAGTACTCAAGGAGTGTAGCCAACTTCCT 560 
GCAGGCCCGGGGCCTGGCCTCGGGCGATGTGGCTGCCATC 600 

610 620 630 640 

TTCATGGAGAACCGCAATGAGTTCGTGGGCCTATGGCTGG 640 
GCATGGCCAAGCTCGGTGTGGAGGCAGCCCTCATCAACAC 680 
CAACCTGCGGCGGGATGCTCTGCTCCACTGCCTCACCACC 720 
TCGCGCGCACGGGCCCTTGTCTTTGGCAGCGAAATGGCCT 760 
CAGCCATCTGTGAGGTCCATGCCAGCCTGGACCCCTCGCT 800 
810 820 830 840 

CAGCCTCTTCTGCTCTGGCTCCTGGGAGCCCGGTGCGGTG 840 
CCTCCAAGCACAGAACACCTGGACCCTCTGCTGAAAGATG 880
CTCCCAAGCACCTTCCCAGTTGCCCTGACAAGGGCTTCAC 920
AGATAAACTGTTCTACATCTACACATCCGGCACCACAGGG 960
CTGCCCAAGGCCGCCATCGTGGTGCACAGCAGGTATTACC 1000 

1010 1020 1030 1040 

GCATGGCTGCCCTGGTGTACTATGGATTCCGCATGCGGCC 1040 
CAACGACATCGTCTATGACTGCCTCCCCCTCTACCACTCA 1080
GCAGGAAACATCGTGGGAATCGGCCAGTGCCTGCTGCATG 1120
GCATGACGGTGGTGATTCGGAAGAAGTTCTCAGCCTCCCG 1160
GTTCTGGGACGATTGTATCAAGTACAACTGCACGATTGTG 1200
CAGTACATTGGTGAACTGTGCCGCTACCTCCTGAACCAGC 1240
CACCGCGGGAGGCAGAAAACCAGCACCAGGTTCGCATGGC 1280 
ACTAGGCAATGGCCTCCGGCAGTCCATCTGGACCAACTTT 1320 
TCCAGCCGCTTCCACATACCCCAGGTGGCTGAGTTCTACG 1360 
GGGCCACAGAGTGCAACTGTAGCCTGGGCAACTTCGACAG 1400

1410 1420 1430 1440 

CCAGGTGGGGGCCTGTGGTTTCAATAGCCGCATCCTGTCC 1440 
TTCGTGTACCCCATCCGGTTGGTACGTGTCAACGAGGACA 1480
CCATGGAGCTGATCCGGGGGCCCGACGGCGTCTGCATTCC 1520
CTGCCAGCCAGGTGAGCCGGGCCAGCTGGTGGGCCGCATC 1560
ATCCAGAAAGACCCCCTGCGCCGCTTCGATGGCTACCTCA 1600

1610 1620 1630 1640

ACCAGGGCGCCAACAACAAGAAGATTGCCAAGGATGTCTT 1640 
CAAGAAGGGGGACCAGGCCTACCTTACTGGTGATGTGCTG 1680 
GTGATGGACGAGCTGGGCTACCTGTACTTCCGAGACCGCA 1720
CTGGGGACACGTTCCGCTGGAAAGGTGAGAACGTGTCCAC 1760
CACCGAGGTGGAAGGCACACTCAGCCGCCTGCTGGACATG 1800 

1810 1820 1830 1840 

GCTGACGTGGCCGTGTATGGTGTCGAGGTGCCAGGAACCG 1840
AGGGCCGGGCCGGAATGGCTGCTGTGGCCAGCCCCACTGG 1880 
CAACTGTGACCTGGAGCGCTTTGCTCAGGTCTTGGAGAAG 1920 
GAACTGCCCCTGTATGCGCGCCCCATCTTCCTGCGCCTCC 1960 
TGCCTGAGCTGCACAAAACAGGAACCTACAAGTTCCAGAA 2000

2010 2020 2030 2040 

GACAGAGCTACGGAAGGAGGGCTTTGACCCGGCTATTGTG 2040 
AAAGACCCGCTGTTCTATCTAGATGCCCAGAAGGGCCGCT 2080
ACGTCCCGCTGGACCAAGAGGCCTACAGCCGCATCCAGGC 2120 
AGGCGAGGAGAAGCTGTGATTCCCCCCATCCCTCTGAGGG 2160 
CCGGCGGATGCTGGATCCGGAGCCCCAGGTTCCGCCCCAG 2200 
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AGCGGTCCTGGACAAGGCCAGACC AAAGCAAGCAGGGCCT 2240 
GGCACCTCCATCCTGAGGTGCTGCCCCTCCATCCAAAACT 2280
GCCAAGTGACTCATTGCCTTCCCAACCCTTCCAGAGGCTT 2320
TCTGTGAAAGTCTCATGTCCAAGTTCCGTCTTCTGGGCTG 2360
GGCAGGCCCTCTGGTTCCCAGGCTGAGACTGACGGGTTTT 2400

2410 2420 2430 2440 

CTCAGGATGATGTCTTGGGTGAGGGTAGGGAGAGGACAAG 2440 
GGGTCACCGAGCCCTTCCCAGAGAGCAGGGAGCTTATAAA 2480 
TGGAACCAGAGCAGAAGTCCCCAGACTCAGGAAGTCAACA 2520
GAGTGGGCAGGGACAGTGGTAGCATCCATCTGGTGGCCAA 2560
AGAGAATCGTAGCCCCAGAGCTGCCCAAGTTCACTGGGCT 2600



FIG. 50C 




2610 2620 2630 2640 

CCACCCCCACCTCCAGGAGGGGAGGAGAGGACCTGACATC 2640 
TGTAGGTGGCCCCTGATGCCCCATCTACAGCAGGAGGTCA 2680 
GGACCACGCCCCTGGCCTCTCCCCACTCCCCCATCCTCCT 2720 
CCCTGGGTGGCTGCCTGATTATCCCTCAGGCAGGGCCTCT 2760
CAGTCCTTGTGGGTCTGTGTCACCTCCATCTCAGTCTTGG 2800 

2810 2820 2830 2840 

CCTGGCTATGAGGGGAGGAGGAATGGGAGAGGGGGCTCAG 2840 
GGGCCAATAAACTCTGCCTTGAGTCCTCCTAAAAAAAAAA 2880 
AAAAAAAAAAAAAAAAAAAAAAAAAAA 2907 
MLLGASLVGVLLFSKLVLKLPWTQVGFSLLFLYLGSGGWR 40
FIRVFIKTIRRDIFGGLVLLKVKAKVRQCLQERRTVPILF 80
ASTVRRHPDKTALIFEGTDTHWTFRQLDEYSSSVANFLQA 120
RGLASGDVAAIFMENRNEFVGLWLGMAKLGVEAALINTNL 160
RRDALLHCLTTSRARALVFGSEMASAICEVHASLDPSLSL 200
210 220 230 240 

FCSGSWEPGAVPPSTEHLDPLLKDAPKHLPSCPDKGFTDK 240
LFYIYTSGTTGLPKAAIVVHSRYYRMAALVYYGFRMRPND 280
IVYDCLPLYHSAGNIVGIGQCLLHGMTVVIRKKFSASRFW 320
DDCIKYNCTIVQYIGELCRYLLNQPPREAENQHQVRMALG 360
NGLRQSIWTNFSSRFHIPQVAEFYGATECNCSLGNFDSQV 400
410 420 430 440 

GACGFNSRILSFVYPIRLVRVNEDTMELIRGPDGVCIPCQ 440
PGEPGQLVGRIIQKDPLRRFDGYLNQGANNKKIAKDVFKK 480
GDQAYLTGDVLVMDELGYLYFRDRTGDTFRWKGENVSTTE 520
VEGTLSRLLDMADVAVYGVEVPGTEGRAGMAAVASPTGNC 560
DLERFAQVLEKELPLYARPIFLRLLPELHKTGTYKFQKTE 600

610 620 630 640 

LRKEGFDPAIVKDPLFYLDAQKGRYVPLDQEAYSRIQAGE 640
EKL 643 
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GTCGTTGGGATCCTCGGCTGCTTAGATCTCGGAGCCACCTGTGTTCT 

GGCCCCCAAGTTCTCTACTTCCTGCTTCTGGGATGACTGTCGGCAGC 

ATGGCGTGACAGTGATCCTGTATGTGGGCGAGCTCCTGGGATACTTG 

TGTAACATTCCCCAGCAACCAGAGGACCGGACACATACAGTCCGCC 

TGGCAATGGGCAATGGACTACGGGCTGATGTGTGGGGAGACCTTCC 

AGCAGCGTTTCGGTCCTATTTCGGATCTNGGGAAGTCTTACGGGCTT 

CCACAGAAGGGCAACATGGGGCTTTAGTTCAAATATTGTTGGGGGC 

GCTGCGGGGCCCTGGGGGCAAAGATGGAGCTTGCCTCCTCCGAATG

CTGTCCCCCTTTGAGCTGGTGCAGTTCGACATGGAGGCGGCGGAGC 

CTGTGAGGGACAATCAGGGCTTCTGCATCCCTGTAGGGCTAGGGGA 

GCCGGGGCTGCTGTTGACCAAGGTGGTAAGCCAGCAACCCTTCGTG 

GGCTACCGCGGCCCCCGAGAGCTGTCGGAACGGAAGCTGGTGCGCA 

ACGTGCGGCAATCGGGCGACGTTTACTACAACACCGGGGACGTACT 

GGCCATGGACCGCGAAGGCTTCCTCTACTTCCGCGACCGACTCGGG 

GACACCTTCCGATGGAAGGGCGAGAACGTGTCCACGCACGAGGTGG 

AGGGCGTGTTGTCGCAGGTGGACTTCTTGCAACAGGTTAACGTGTAT 

GGCGTGTGCGTGCCAGGTTGTGAGGGTAAGGTGGGCATGGCTGCTG 

TGGCATTAGCCCCCGGCCAGACTTTCGACGGGGAGAAGTTGTACCA 

GCACGTTCGCGCTTGGCTCCCTGCCTACGCTACCCCCCATTTCATCC 

GCATCCAGGACGCCATGGAGGTCACCAGCACGTTCAAACTGATGAA 

GACCCGGTTGGTGCGTGAGGGCTTCAATGTGGGGATCGTGGTTGAC 

CCTCTGTTTGTACTGGACAACCGGGCCCAGTCCTTCCGGCCCCTGAC 

GGCAGAAATGTACCAGGCTGTGTGTGAGGGAACCTGGAGGCTCTGA 

TCACCTGGCCAACCCACTGGGGTAGGGATCAAAGCCAGCCACCCCC 

ACCCCAACACACTCGGTGTCCCTTTCATCCTGGGCCTGTGTGAATCC 

CAGCCTGGCCATACCCTCAACCTCAGTGGGCTGGAAATGACAGTGG 

GCCCTGTAGCAGTGGCAGAATAAACTCAGMTGYGTTCACAGAAA 
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VVG I LGCLDLGATCVLAPKF STSCFWDDCRQHGVTV I LYV 40 
GELLRYLCNIPQQPEDRTHTVRLAMGNGLRADVWGDLPAA 80
FRSYFGSXEVLRASTEGQHGALVQILLGALRGPGGKDGAC 120
LLRMLSPFELVQFDMEAAEPVRDNQGFCIPVGLGEPGLLL 160
TKVVSQQPFVGYRGPRELSERKLVRNVRQSGDVYYNTGDV 200 

210 220 230 240 

LAMDREGFLYFRDRLGDTFRWKGENVSTHEVEGVLSQVDF 240 
LQQVNVYGVCVPGCEGKVGMAAVALAPGQTFDGEKLYQHV 280
RAWLPAYATPHFIRIQDAMEVTSTFKLMKTRLVREGFNVG 320
IVVDPLFVLDNRAQSFRPLTAEMYQAVCEGTWRL 354
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AACGGCAAGTAAGCGCAACGCAATTAATGTGAGTAGCTCA 40 
CTCATTAGGCACCCCAGGCTTTACACTTTATGCTTCCGGG 80
CTCGTATGTTGTGTGGAATTGTGAGCGGATACCAATTTCA 120
CACAGGAACCAGCTATGACATGATTACGAATTTAATACGA 160
CTCACTATAGGGAATTTGGCCCTCGAGGCCAAGAATTCGG 200
210 220 230 240 

CACGAGGGGTGCTGAGCCCCTGCGCGGTTTCTGGTGCGTA 240 
GAGACTGTAAATCGCTGCGCTTCTCAGTCATCATCATCCC 280 
AGCTTTTCCCGGCTCGAATTCAGCCTCCAACTCAAGCTCG 320 
CGGGAAAGACTACCTGAGAGGAGAAAAGCTTCTGTCCCTG 360 
GACCTTCTTCTGAGGGTGGAGTCGGAGGCTCCCTGCTTTC 400
410 420 430 440 

CAGCCGCCCAGTGACCCAAGCTTAATCTTCAGCACCACTT 440
GGGGCGACCTTTTCGGTGCAAACCTACGATTCTGTTTCTC 480
AGGATTCCTCCCCATCCCGCTTCGCCCCGGAAAAGCTGAC 520
AAGAACTTCAGGTGTAAGCCCTGAGTAGTGAGGATCTGCG 560
GTCTCCGTGGAGAGCTGTGCCTGGAAGAGAAGGACGCTGG 600
610 620 630 640 

TGGGGGCTGAGATCAGAGCTGTCTTCTGGCCCAGTTGCCC 640 
CCATGCTTCTGTCATGGCTAACAGTTCTAGGGGCTGGAAT 680
GG^CGTCCTGCACTTCTTGCAGAAACTCCTGTTCCCTTAC 720 
i TGGGAi GACTTCTGGTTCGTGTTGA.AGGTGGTGCTCA 760 
TTATAATTCGGCTGAAGAAGTATGAAAAGAGAGGGGAGCT 800
810 820 830 840

GGTGACTGTGCTGGATAAATTCTTGAGTCATGCCAAAAGA 840 
CAACCTCGGAAACCTTTCATCATCTATGAGGGAGACATCT 880 
ACACC LATCAGGATGTAGACAAAAGGAGCAGCAGAGTGGC 920 
CCATGTCTTCCTGAACCATTCCTCTCTGAAAAAGGGGGAC 960
ACGGTGGCTCTGCTGATGAGCAATGAGCCGGACTTCGTTC 1000
1010 1020 1030 1040

ACGTGTGGTTCGGCCTCGCGAAGCTGGGCTGCGTGGTGGC 1040 
CTTTCTCAACACCAACATTCGCTCCAACTCCCTCCTGAAT 1080
TGCATCCGCGCCTGTGGGCCCAGAGCCCTAGTGGTGGGCG 1120
CAGATTTGCTTGGAACGGTAGAAGAAATCCTTCCAAGCCT 1160
CTCAGAAAATATCAGTGTTTGGGGGATGAAAGATTCTGTT 1200
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CC ACAAGGTGTAATTTC ACTCAAAGAAAAACTGAGCACCT 1240 
CACCTGATGAGCCCGTGCCACGCAGCCACCATGTTGTCTC 1280 
ACTCCTCAAGTCTACTTGTCTTTACATTTTTACCTCTGGA 1320
ACAACAGGTCTACCAAAAGCAGCTGTGATTAGTCAGCTGC 1360 
AGGTTTTAAGGGGTTCTGCTGTCCTGTGGGCTTTTGGTTG 1400 
1410 1420 1430 1440 


TACTGCTCATGACATTGTTTATATAACCCTTCCTCTGTAT 1440 
CATAGTTCAGCAGCTATCCTGGGAATTTCTGGATGTGTTG 1480 
AGTTGGGTGCC ACTTGTGTGTTAAAGAAGAAATTTTC AGC 1520 
AAGCCAGTTT I GGAGTGACTGCAAGAAGTATGATGTGACT 1560 
GTGTTTCAGTATATTGGAGAACTTTGTCGCTACCTTTGCA 1600 
1610 1620 1630 1640 


AACAATCTAAGAGAGAAGGAGAAAAGGATCATAAGGTGCG 1640 
TTTGGCAATTGGAAATGGCATACGGAGTGATGTATGGAGA 1680 
GAATTTTTAGACAGATTTGGAAATATAAAGGTGTGTGAAC 1720 
TTTATGCAGCTACCGAATCAAGCATATCTTTCATGAACTA 1760 
CACTGGGAGAATTGGAGC AATTGGGAGAAC AAATTTGTTT 1800

1810 1820 1830 1840 


TACAAAC TTC TIT CC AC ITT TG AC TTAATAAAGTATG AC T 1840 
TTCAGAAAGATGAACCC ATG AGAAATGAGC AGGGTTGGTG 1880
TATTCATGTGAAAAAAGGAGAACC TGG AC TTC TC ATT TCI 1920 
CGAGTGAATGC AAAAAATCCC TTCTTTGGCTATGC TGGGC 1960 
C T T AT A A G C AC AC AAAAGAC AAA TTGCTTTGTGATGTTTT 2000 

2010 2020 2030 2040 


TAAGA'AGGGAG ATGTTTACC TTAATAC TGGAG ACTTAATA 2040 
GTCC AGGATC AGG AC AATTTCC TTTATTTTTGGGACCGTA 2080 
CTGGAGACAC TTTC AGATGGAAAGGAGAAAATGTCGC AAC 2 120 
CACTGAGGTTGC TGATGTTATTGGAATGTTGGATTTCATA 2 160 
CA'GGAAGCAAACGTCTATGGTGTGGCTATATCAGGTTATG 2200 
AAGGAAGAGC AGGAATGGCTTCTATTATTTTAAAACCAAA 2240
TACATCTTTAGATTTGGAAAAAGTTTATGAACAAGTTGTA 2280 
ACATTTCTACCAGCTTATGCTTGTCCACGATTTTTAAGAA 2320 
TTC AGG AA AA A ATGG A A GC A AC AGGAACATTC A A AC T-ATT 2360 
GAAGCATC AGTTGGTGG AAGAT.GGATTTAATCC ACTGAAA 2400 

2410 2420 2430 2440 


ATTTCTGAACCACTTTACTTCATGGATAACTTGAAAAAGT 2440 
CTTATGTTCTACTGACC AGGGAACTTTATGATC AAATAAT 2480 
GTTAGGGGAAATAAAACTTTAAGATTTTTATATCTAC-AAC 2520 
TTTC ATATGCTTTCTTAGGAAGAGTGAGAGGGGGGTATAT 2560 
GATTCTTTATGAAATGGGGAAAGGGAGCTAACATTAATTA 2600 
2610 2620 2630 2640


TGCATGTACTATATTTCCTTAATATGAGAGATAATTTTTT 2640 
^ATTGCATAAGAATTTTAATTTCTTTTAATTGATATAAAC 2680 
ATTAGTTGATTATTCTTTTTATCTATT I GGAGA I 1CAG1G 2720 
CATAACTAAGTATTTTCCTTAATACTAAAGATTTTAAATA 2760 
ATAAATAGTGGCTAGCGGTTTGGACAATCACTAAAAA I G ! 2800 

2810 2820 2830 2840 


ACTTTCTAATAAGTAAAATTTCTAATTTTGAATAAAAGAT 2840 
TAAAT'TTTACTGAAAAAAAAAAAAAAAAAAAAAATTGGCG 2880 
GCCGC 2885 
MLLSWLTVLGAGM VVLHFLQKLLFPYFWDDF WF VLKVVL I 40

I IRLKKYEKRGELVTVLDKFLSHAKRQPRKFFi I YEGOIY 80 

TYQDVDKRSSRVAH^FLNHSSLKKGOTVALLMSNEPDFVH 120 

VWFGLAKLGC V VAFLNTN ! RSNSlLNC I RACGPR ALV VGA 160 

OLLGTVEE I LPSLSEN I SVWGMKQSVPGGV ISLKEKLSTS 200 

210 220 230 240 


P0EPVPRSHHVVSLLKSTCLYIFT5GTTGLPKAAVISQLQ 240 
VLRGSAVLWAFGCTAHD I VY I 7LPLYHSSAA ILG ISGCVE 280 
LGATC VLKKKFSASQFWSQCKK YDVTV'FQY I GELCRYLCK 320 
QSKREGEKDKKVRLAIGNG IRSDVWREFLDRFGN TKVCEL 360 
YAATESSISFMNYTGRIGAIGRTNLFYKLLSTFDLIKYDF 400 

410 420 430 440 


QKDEPMRNEQGWC I KVKKGEPGLL i SRVNAKNFFFGYAGP 440 
YKHTKOKLLCO VFKKGOVYLNTGOL I VQQGONFL YFWCRT 480 
GDTFRWKGENVATTEVADV IGMLDF I QEANVYGVA 1 SG YE 520 
GRAGMASI ILKPNTSLDLEKVYEGVVTFLPAYACPRFLRi 560 
QEKMEATGTFKLLKHQLVEDGFNPLK I SEPLYFM0NLKK3 600 

610 620 630 640

YVLLTRELYOG IMLGEIKL 619 
AAGTTCCCACTCCAGACTTCTGCGAGAACCCGTGAGGAAG 40
CAGCGAGAACCGGGGGTTTGCAAGCCAGAGAAGGATGCGG 80 
ACTCCGGGAGCAGGAACAGCCTCTGTGGCCTCATTGGGGC 120 
TGCTTTGGCTTCTGGGACTTCCGTGGACCTGGAGCGCGGC 160 
GGCGGCGTTCGGTGTGTACGTGGGTAGCGGTGGCTGGCGA 200 
210 220 230 240 


TTTCTGCGTATCGTCTGCAAGACGGCGAC-GCGAGACCTCT 240 
TTGGCCTCTCTGTTCTGATCCGCGTGCGGCTAGAGCTACG 280 
ACGACACCGGCGAGC AGGAGACACGATCCCACGCATCTTC 320 
CAGGCCGTGGCCCAGCGACAGCCGGAGCGCCTGGCGCTGG 360 
TAGATGCGAGTAGCGGTATCTGCTGGACCTTCGCAC AGCT 400

410 420 430 440

AGACACCTACTCCAATGCTGTGGCCAATCTGTTCCTCCAG 440 
CTGGGCTTTGCGCCAGGCGATGTGGTGGCTGTGTTCCTGG 480 
AAGGCCGGCCCGAGTTCGTGGGACTGTGGCTGGGC'CTGGC 520 
CAAGGCCGGTGTAGTGGCTGCGCTTCTCAATGTCAACCTG 560 
AGGCGGGAGCCCCTTGCCTTCTGCTTGGGCACATCAGCTG 600 

610 620 630 640 


CCAAGGCCCTCATTT ATGGCGGGGAGATGGCAGCGGCGGT 640 
GGCGGAGGTGAGTGAGCAGCTGGGGAAGAGCCTGCTCAAG 680 
TTCTGCTCTGGAGATCTGGGGCCTGAGAGCGTCCTGCCTG 720
ACACGCAGCTTCTGGACCCCATGCTTGCTGAGGCGCCCAC 760 
CACACCC'CTGGCAC AGGCCCCAGGCAAGGGCATGGATGAT 800 

810 820 830 840 


CGGCTATTTTACATCTATACTTCTGGGACCACCGGACTTC 840 
CTAAGGCGGCCATTGTGGTGCACAGCAGGTACTACCGCAT 880 
CGCAGCCTTCGGCC ACCATTCCTACAGCATGCGGGCCAAC 920 
GATGTGCTCTATGACTGCCTACCTCTCTACCACTC AGC AG 960 
GGAACATC ATGGGCGTGGGACAGTGTATCATCTACGGGTT 1000 

1010 1020 1030 1040 

AACGGTGGTACTGCGCAAGAAGTTCTCCGCCAGCCGCTTC 1040 
l GGGACGACTGTGTCAAATATAA TTGCACGGTAGTGCAGT 1080 
• ACATCGGTGAAATATGCCGCTACCTGCTAAGGCAGCCGGT 1 120 
TCGCGATGTAGAGCGGCGGCACCGCGTGCGCCTGGCCGTG 1 160 
GGTAACGGAC i GCGGCCAGCCATCTGGGAGGAGTTC ACGC 1200 
AGGGTTTCGGTGTGCGACAGATTGGCGAGTTCTACGGCGC 1240
CACCGAATGCAACTGCAGCATTGCCAACATGGACGGCAAG 1280 
GTCGGCTCCTGCGGCTTCAACAGCCGTATCCTCACGCATG 1320 
TGTACCCCATCCGTCTGGTCAAGGTCAACGAGGACACGAT 1360 
GGAGCCACTGAGGGACTCCCAAGGCCTCTGCATCCCGTGC 1400 

1410 1420 1430 1440 


CAGCCCGGGGAACCTGGGCTTCTCGTGGGCCAGATCAACC 1440 
AGCAAGACCCTCTGCGGCGCTTCGATGGCTATGTTAGTGA 1480 
CAGCGC.CACCAACAAGAAGATTGCCCACAGCGTGTTCCGA 1520 
AAGGGGGACAGCGCCTACCTTTCAGGTGACGTGCTAGTGA 1560 
TGGACGAGCTGGGGTACATGTACTTCCGTGACCGCAGCGG 1600 

1610 1620 1630 1640 


GGATACCTTCCGATGGCGCGGCGAGAACGTATCCACCACG 1640 
GAGGTGGAAGCCGTGCTGAGCCGCCTGTTGGGCCAGACGG 1680 
ACGTGGCTGTGTATGGAGTGGCTGTGCCAGGAGTGGAGGG 1720 
GAAAAGCGGCATGGCGGCCATTGCAGACCCCCACAACCAG 1760 
CTGGACCCTAACTCAATGTACCAGGAATTGCAGAAGGTTC 1800 

1810 1820 1830 1840 


TTGCATCCTATGCCCAGCCCATCTTCCTGCGTCTTCTGCC 1840 
CCAAGTGGATAC AACAGGCACCTTCAAGATCCAGAAGACC 1880 
CGACTACAGCGTGAAGGCTTTGACCCCCGCCAGACCTCAG 1920 
ACCGGCTCTTCTTTCTAGACCTGAAACAGGGACGCTACCT 1960 
ACCCCTGGATGAGAGAGTCCATGCCCGCATCTGCGCAGGC 2000 

2010 2020 2030 2040 


GACTTCTCACTC TGAGCCTGGTGAGTGGGATGGCCCTGGA 2040 
CTTGTGAGACCAGGGAGCCGGACACCCCTGTTCAGGTGTT 2080 
TCTCCTGCTTGGCCACGTGGCCAGCAGCACCTGTGGGTGC 2120 
AGGAAACTGGAACCTGAGTGGCCGGGTGTCCCTTTCCTAC 2160 
AACCC ACCATGC ACACATCTAGCCTCTGCCTTGGTCTTTT 2200 
TCTCCATCTCTTTCCTCCGTGCCCAGCAGGAGCCCCACAG 2240
ACACATTGGCTGCTGTGTCCTGCAGTGGGACCGGTGTCTA 2280 
GGGGTCCATGCTGCAGGCTGTGACCCGCACTGGTGCCC AC 2320 
CTCCCTTCCCCATTGTGCCTTAGGTTCCTCCACTGTGCGC 2360 
CGGTGAAGCAAGTGGGGACCCACATAGCTGTTGTCCCTGC 2400 

2410 2420 2430 2440 

TGAGGGTTGGTAGCAAATGC ACCCTCATGTCAGCTGGGAG 2440 
ACACATGCAGTCTCCCACTGACCCCCAATCAACTGAAGAT 2480 
ACTGTTTTGTATTATTGTTTTGAGATAGGGTCTCACTGTG 2520 
GAGGCCAAGCTGGCCTCAGGCTCACCACTCTACTGCCTCC. 2560 
GGGCACCAGCCTGCAGTTTGATGACATGTATGCACTATTG 2600 
TTCTAAGGGTCTTCTGAGTCCCTGCTTTCCCCTCATGTCC 2640
TAAAACCTTCCAGAACTGACTCTGATCACTTGGATGTAGC 2680 
TAGTGTTGGCCCTGCCCACGTGTGTCAATTCAGGGGTCCC 2720 
CAGGCATCATCTCTGGAGGCCCTAACCTTGGCAAAGCTTG 2760 
GATGTCCTCACATCACAGCAGGAGACCCAGGAAGGTTGCT 2800 
2810 2820 2830 2840 


GTGGTGTCTCTTGGGCACCCCTGGCGGCAGCCGTGGACAT 2840 
GCTTCCCTGCTGTGATAGCCCAAACTGTTGCCTATGACAT 2880 
TTGAGGTCTACCCTTCTGGCTGCCATGGTCCCCATTGAGA 2920 
TCTTTGGTGACTCACCTCAGCCACCAAGCCAGGCCTCTGC 2960 
CTTCCTTCAGCTCTAAGGGCATGAAGGGTGTGGACAGAGC 3000 

3010 3020 3030 3040 

AGCCACAGGCTGCCCACAGTCACCCACATGCAAGTG TTAT 3040 
TTCCTTGTTTGTTTTAAAAAAATAAACATGCTGAGCCTTG 3080 
AAAAAAAAAAAAAAAAAA 3098 
MRTPGAGTASVASLGLLWLLGLPWTWSAAAAFGVYVGSGG 40
WRFLR I VCKTARRDLFGLSVL I RVRLELRRHRRAGDT I PR 80 
I FQAVAQRQPERLALVDASSG I CWTFAQLDTYSNAVANLF 120 
LQLGFAFGDVVAVF LEGRPEF VGLWLGLAKAG VVAALLNV 1 60 
NLRREPLAFCLGTSAAKAL I YGGEMAAAVAEVSEQLGKSL 200 

210 220 230 240 


LKFCSGDLGPESVLPDTQLLDPMLAEAPTTPLAQAPGKGM 240 
DDRLFY I YTSGTTGLPKAA I VVHSRYYR I AAFGHHSYSMR 280 
ANDVLYOCLPLYHSAGN I MGVGQC I I YGLTVVLRKKFSAS 320 
RFWDDCVKYNCTV VQY I GE4X-RYLLRQP VRDVERRHRVRL 360 
AVGNGLRPA I WEEFTQGFGVRQ I GEFYGATECNCS I ANMD 400 

410 420 430 440 


GKVGSCGFNSR I LTHVYP I RLVK VNEDTMEPLRDSQGLC I 440 
PCQPGEPGLLVGQ I NQQDPLRRFDGYVSDSATNKK I AHSV 480 
FRKGOSAYLSGDVLVMDELGYMYFRDRSGDTFRWRGENVS 520 
TTEVEAVLSRLLGQTDVAV YGVAVPGVEGKSGMAA I ADPH 560 
NQLDPNSMYQELQKVLASYAQP I FLRLLPQVDTTGTFK IQ 600 

610 620 630 640

KTRLQREGFDPRQTSDRLFFLGLKGGR YLPLDERVHAR I C 640 
AGDFSL 646 
GACACAGTACTGCCGATGTTGGACAGAGGATCGCTTAACA 40
GAACGAAATCTCAAAACAAATTAACAGGACCCGGTTGCTT 80 
GATTTCCCAAATCAGAAAAGGCTCGAAATGTCTAGAGGGG 120 
CTGACTGATGCAGCGGTGACCCGGACTGGAGACAGTTGGA 160 
CGCGATCATCTCTGGTGCTTTTGTTCAACCTTGAAACCTT 200 

210 220 230 240 

CGCCACAGGAGACTTGCCTGAGCAGAGAAGCAAACGTGGA 240 
GAAACAAAGAGAG ATCTAGCGAAAAGCCTCTGGGACCAAG 280 
GAGGGGAGGTGGGACTCTGGGTTGGCGGTGGCACCTGCTG 320 
CCGGCTATTAATAATAGGGTCGCGATGCGTTTATAAGGTG 360 
TTTGATTAAACAAAGACTCTATGAGAGAAGAATAACTAGC 400 
410 420 430 440 


AACAGCCCCACGTCTGAGTCGTCGCCTCCGACCTTTTTCA 440 
ACGTGGGTTCTTTGGGCCGAGCGTCGTTTGCCGAGAACTA 480 
GATCTCACCTGACCCCAGACGCTGAAAACAAGCGCTGTGG 520 
CATCCTGGGCCACCCAAGCTGACAAGGGCGCGCCCCCTGA 560 
GCACACGAGGTGCCCCACGAGGGGGAGGGACCCACAGCCG 600 

610 620 630 640 


TCCCGCCCGCACCGCGGTGTCCGCTGCGGGCACCTGCAGC 640 
CGAGCCGCC ACCCGCAGTCGCAGCGCGTCCGGCGGCCGAA 680 
CCCGGTCGTCAGCTCGTCAGCACCTGCTCTGCTTCTCTCC 720 
CGCCCGCCGCCGCGCTGCACGCCTCGAGCGCTCCCTCGGC 760 
CCCGGCGGGGACCGGGGACCCCGCAGCCACCGCCATGCTG 800 

810 820 830 840 

CCTGTGCTCTAC ACCGGCCTGGCGGGGCTGCTGCTGCTGC 840 
CTCTGCTGCTCACCTGCTGCTGCCCCTACCTCCTCCAGGA 880 
CGTGCGGT7CTTCCTGCAACTGGCCAACATGGCCCGGCAG 920 
GTGCGCAGCTACCGGCAGCGGCGACCCGTGCGCACCATCC 960 
TGCATGTCTTCTTGGAGCAAGCGCGCAAGACCCCGCACAA 1000 

1010 1020 1030 1040 

GCCCTTCCTGCTGTTTCGCGACGAGACGCTTACCTACGCC 1040 
CAGG l AGACCGGCGCAGCAACCAAGTAGCGCGAGCGCTGC 1080 
ATGATCACCTGGGCCTGCGGCAGGGGGATTGCGTGGCCCT 1 120 
CTTCATGGGCAATGAGCCGGCCTACGTGTGGCTCTGGCTG 1 160 
GGACTGC7CAAACTGGGCTGTCCCATGGCGTGCCTCAACT 1200 
ACAACATCCGTGCCAAGTCTCTGCTACACTGCTTTCAGTG 1240
CTGCGGGGCGAAGGTGCTGCTGGCCTCCCCAGAGCTACAC 1280 
GAAGCTGTCGAGGAGGTTCTTCCAACCCTGAAAAAGGAGG 1320 
GCGTGTCCGTCTTCTACGTAAGCAGAACTTCTAACACTAA 1360 
TGGCGTGGACACAGTACTGGACAAAGTAGACGGGGTGTCG 1400 
1410 1420 1430 1440 


GCGGACCCCATCCCGGAGTCGTGGAGGTCTGAAGTCACGT 1440 
TCACC ACACCCGC AGTCTACATATATACTTCGGGCACCAC 1480 
AGGTCTTCCAAAGGCTGCAACCATTAATCACCATCGCCTC 1520 
TGGTATGGGACCAGCCTTGCCCTGAGGTCCGGAATTAAGG 1560 
CTCATGACGTCATCTACACCACCATGCCCCTGTACCACAG 1600 
1610 1620 1630 1640 


CGCGGCGCTCATGATTGGCCTCCACGGATGCATTGTGGTT 1640 
GGGGCTACATTTGCTTTGCGGAGCAAATTTTCAGCCAGCC 1680 
AGTl i l GGGACGACTGCAGGAAATACAACGCCACTGTCAT 1720 
TCAGTAC ATCGGTGAACTGCTTCGGTACCTCTGCAACACG 1760 
CCCCAGAAACCAAATGACCGGGACCACAAAGTGAAAATAG 1800 

1810 1820 1830 1840 

CACTAGGAAATGGCTTACGAGGAGATGTGTGGAGAGAGTT 1840 
CATCAAGAGATTTGGGGACATTCACATTTATGAGTTCTAC 1880 
GCTTCCACTGAAGGCAACATTGGATTTATGAACTATCCAA 1920 
GAAAAATCGGAGCTGTTGGAAGAGAAAATTACCTACAAAA 1960 
AAAAGTTGTAAGGCACGAGCTGATCAAGTA7GACGTGGAG 2000 

2010 2020 2030 2040 


AAGGATGAGCCTGTCCGTGATGCAAATGGATATTGCATCA 2040 

AAGTCCCCAAAGGAGAGGTTGGAC TCTTGATTTGCAAAAT 2080 

CACAGAGCTCACACCATTTTTTGGCTATGCTGGAGGAAAG 2120 

ACCCAGACAGAGAAGAAAAAGCTCAGAGATGT7TTTAAGA 2160 

AAGGAGACGTCTACTTCAACAGTGGCGATCTCCTGATGAT 2200 
CGACCGTGAAAATTTCATCTATTTTCACGACAGAGTTGGA 2240
GACACCTTCCGGTGGAAAGGAGAGAATGTAGCTACCACGG 2280 
AAGTCGCTGAC ATTGTGGGACTGGTAGATTTTGTTGAAGA 2320 
AGTGAATGTTTACGGTGTGCCCGTGCCAGGTC ATGAAGGT 2360 
CGCATCGGGATGGCCTCGATCAAGATGAAAGAAAACTACG 2400 

2410 2420 2430 2440 


AGTTC AATGGAAAGAAACTCTTTCAGCACATCTCGGAGTA 2440 
CCTGCCCAG.TTACTCGAGGCCTCGGTTCCTGAGAATACAA 2480 
GATACCATTGAGATCACCGGGACTTTTAAACACCGCAAAG 2520 
TGACCCTGATGGAAGAGGGCTTTAACCCCTCAGTCATCAA 2560 
AGATACCTTG T.ATTTC ATGGATGACACAGAAAAAACATAC 2600 



FIG. 58C 



2610 2620 2630 2640 


GTGCCCATGACTGAGGAC ATTTATAATGCCATAATTGATA 2640 
AGACTCTGAAGCTCTGAATGTTGCCTGGCTCCTAACACTT 2680 
CCAGAAAGAAAC ACAATAGGCCTAGCATAGCCCCTTCACA 2720 
•TGTGTAATCC AACTTTAACTTGATTAAAGGTTATAGGTGT 2760 
GATTTTTCCTAGGAAATTATTCATTTAAAGGACAATTGTT 2800 

2810 2820 2830 2840 


TGTTTGTTTGT7TGTTTTTTATTAATTAC ACCAGAACGTT 2840 
TGCAAGTAAAAAGATTTAAAGTCACTTATTTTTCAATGTG 2880 
CACCTGCCATTTGTCCTTGCAAACTTAGCTTCTTGGAGAG 2920 
AGGGCCTTATTTTTTTAAAGAC ATAATAAACTATGTAAAC 2960 
ACT 2963 
MLPVLYTGLAGLLLLPLLLTCCCPYLLQDVRFFLQLANMA 40
RQVRSYRQRRP VRT I LHVFLEQARKTPKKPFLLFRDETLT 80 
YAQVDRRSNQVARALHDHLGLRQGDCVALFMGNEP AYVWL 1 20 
WLGLLKLGCPMACLNYN I RAKSLLHCFQCCGAK VLLASPE 160 
LHEAVEEVLPTLKKEGVSVFYVSRTSNTNGVDTVLDKVOG 200 

210 220 230 240 


VSADP I PESWRSEVTFTTPAVY I YTSGTTGLPKAAT I NHH 240 
RLWYGTSLALRSG I KAHDV I YTTMPLYHSAALM I GLKGC I 280 
VVGATFALRSKFSASQFWDDCRK YNATV IQYIGELLRYLC 320 
NTPQKPNDRDHKVK I ALGNGLRGD VWR-E-F I KRFGD I K I YE 360 
FYASTEGN I GFMNYPRK I GAVGRENYLQKK V VRKEL I KYD 400 

410 420 430 440 


VEKDEPVRDANGYCIKVPKGEVGLLICK ITELTPFFGYAG 440 
GKTQTEKKKLRDVFKKGDVYFNSGDLLM I DRENF I YFHDR 480 
VGDTFRWKGENVATTEVADI VGLVDFVEEVNVYGVPVFGH 520 
EGR ! GMAS I KMKENYEFNGKKLFOH I SEYLPSYSRPRFLR 560 
IQDT IEITGTFKHRKVTLMEEGFNPSVIKDTLYFMDDTEK 600 

610 620 630 640 


TYVPMTEDI YNAi IDKTLKL 620 
GATCAGCTCTTCTATATCTACACGTCGGGCACCACGGGGC 40
TACCCAAAGCTGCCATTGTGGTGCACAGC AGGTATTACCG 80 
AATGGCTGCCCTGGTGTACTATGGATTCCGCATGCGGCCT 120 
GATGACATTGTCTATGACTGCCTCCCCCTCTACCACTCAG 160 
CAGGAAACATTGTGGGGATTGGCCAGTGCGTACTCCACGG 200 

210 220 230 240 


CATGACTGTGGTGATCCGGAAGAAGTTTTCAGCCTCCCGG 240 
TTCTGGGATGACTGTATCAAGTACAACTGCACAATTGTAC 280 
AGTACATTGGTGAGCTTTGCCGCTACCTCCTGAACCAGCC 320 
ACCCCGTGAGGCTGAGTCTCGGCACAAGGTGCGCATGGCA 360 
CTGGGCAACGGTCTCCGGCAGTCCATCTGGACCGACTTCT 400 

410 420 430 440 


CCAGCCGTTTCCAC ATTCCCAAGGTGGCCGAGTTCTACGG 440 
GGCCACCGAGTGCAACTGTAGCTTGGGCAACTTTGACAGC 480 
CAGGTGGGGGCCTGTGGCTTCAATAGCCGCATCCTGTCCT 520 
TTGTGTACCCCATCCGCTTGGTACGAGTCAATGAGGATAC 560 
CATGGAACTGATCCGGGGACCCGATGGCGTCTGCATTCCC 600 

610 620 630 640

TGTCAACCAGGCC AGCCAGGCCAGCTGGTGGGTCGCATCA 640 
TCCAGCAGGACCCCCTACGCCGTTTTGATGGCTACCTCAA 680 
CCAGGGTGCCAAC AACAAGAAGATTGCTAGTGATGTCTTC 720 
AAG A A AGGG G A CC A AGCCTACCTC AC TGGTG AC GTGCTGG 760 
TGATGGATGAGCTGGGCTACCTGTACTTCCGAGACCGCAC 800 

810 820 830 840 


AGGGGACACGTTCCGCTGGAAAGGGGAGAATGTGTCTACC 840 
ACTGAAGTGGAGGGCACACTCAGCCGCCTGCTTCAGATGG 880 
CAGATGTGGCTGTTTATGGTGTTGAGGTGCCAGGAGCTGA 920 
GGGCCGAGCAGGAATGGCTGCTGTGGCAAGCCCCACTAGC 960 
AACTGTGACCTGGAGAGCTTTGCACAGACCTTGAAAAAGG 1000 

1010 1020 1030 1040 


AGCTGCCCCTGTACGCCCGCCCCATCTTCCTCCGCTTCTT 1040 

GCCTGAGCTGCAC AAAACAGGAACCTTCAAGTTCCAGAAG 1080 

ACAGAGTTGCGGAAGGAGGGCTTTGACCCGTCTGTTGTGA 1 120 

AAGACCCACTCTTCTATTTGGATGCCCGGACAGGCTGCTA 1 160 

TGTTGCACTGGACC AAGAGGCCTATACCCGCATCCAGGCA 1200 

FIG. 60A 
GGCGAGGAGAAGCTGTGATTTCCCCCACATCCCTCTGAGG 1240

GCCAGAGGATGCTGGATTCAGAGCCCCAGCTTCCACTCCA 1280 

GAAGGGGTCTGGGCAAGGCCAGACCAAAGCTAGCAGGGCC 1320 
CGCACCTTCACCCTAGGTGCTGATCCCCCT 1350 



FIG. 60B 



10 20 30 40 

DQLFY I YTSGTTGLPKAA I VVHSRY YRMAAL V YYGFRiIRP 40 
DDI VYDCLPLYHSAGN I VG ! GQC VLHGMTVV I RKKFSASR 80 
FWDOC I KYNCT I VQY I GELCRYLLNQPPREAESRHKVRMA 120 
LGNGLRQS I WTDFSSRFH I PK VAEFYGATECNCSLGNFDS 160 
QVGACGFNSR I LSFVYP I RLVRVNEDTMEL I RGPDGVC I P 200 

210 220 230 240 


CQPGQPGQLVGR I I QQDPLRRFDGYLNGGANNKK I ASDVF 240 
KKGDGAYLTGDVLVMDELGYLYFRDRTGDTFRWKGENVST 280 
TEVEGTLSRLLQMADVAVYGVEVPGAEGRAGMAAVASPTS 320 
NCDLESFAQTLKKELPLYARP I FLRFLPEL'riKTGTFKFQK 360 
TELRKEGFDPSV VKDPLFYLDARTGCYVALDQEAYTR I QA 400 

410 420 430 440 


GEEKL 405 
ATGCGQGCTCCTGGAGCAGGAACAGCCTCTGTGGCCTC AC 40
TGGCGCTGCTTTGGTTTCTGGGACTTCCGTGGACCTGGAG 80 
CGCGGCGGCGGCGTTCTGTGTGTACGTGGGTGGCGGCGGC 120 
TGGCGCTTTCTGCGTATCGTCTGCAAGACGGCGAGGCGAG 160 
ACCTCTTTGGCCTCTCTGTTCTGATTCGTGTTCGGCTAGA 200 

210 220 230 240

GCTGCGACGACACCGGCGAGCAGGAGACACGATCCCGTGC 240 
ATCTTCCAGGCTGTGGCCCGGCGACAACCAGAGCGCCTGG 280 
CACTGGTGGACGCC AGTAGTGGTATATGCTGGACCTTCGC 320 
ACAGCTGGACACCTACTCCAATGCTGTAGCCAACCTGTTC 360 
(r&CCAGCTGGGCTTTGC ACCAGGCGATGTGGTGGCTGTGT 400 

410 420 430 440 


TCCTGGAGGGCCGGCCGGAGTTCGTGGGACTGTGGCTGGG 440 
CCTGGCCAAGGCCGGTGTGGTGGCTGCTCTTCTCAATGTC 480 
AACCTGAGGCGGGAGCCCCTGGCCTTCTGCCTGGGCACAT 520 
C AGCTGCCAAGGCCCTC AITTATGGCGGGGAGATGGCAGC 560 
GGCGGTGGCGGAGGTGAGCGAGCAGCTGGGGAAGAGCCTC 600 

610 620 630 640 


CTCAAGTTCTGCTCTGGAGATCTGGGGCCTGAGAGCATCC 640 
TGCCTGACACGCAGCTCCTGGACCCCATGCTTGCTGAGGC 680 
GCCCACCACACCCCTGGCACAAGCCCCAGGCAAGGGCATG 720
GATGATCGGCTGTTTTACATCTATACTTCTGGGACCACCG 760 
GGCTTCCTAAGGCTGCCATTGTGGTGCACAGCAGGTACTA 800 

810 820 830 840 


CCGCATTGCTGCCTTTGGCCACCATTCCTACAGCATGCGT 840 

GCCGCCGATGTGCTCTATGACTGCCTGCCACTCTACCACT 880 

CTGCAGGGAACATCATGGGTGTGGGGCAGTGCGTCATCTA 920 

CGGGTTGACGGTGGT ACTGCGC AAGAAGTTCTCCGCC AGC 960 

CGCTTCTGGGATGACTGTGTCAAGTACAATTGCACGG I AG 1000 

1010 1020 1030 1040

TGGATGAC ATAGGTGAAATCTGCCGCTACCTGCTGAGGCA 1040 

GCCGGTTCGCGACGTGGAGCAGCGACACCGCGTGCGCCTG 1080 

GCCGTGGGTAATGGGCTGCGGCCAGCCATCTGGGAGGAGT 1 120 

TCACGCAGCGCTTCGGTGTGCCACAGATCGGCGAGTTCTA 1 160 

CGGCGCTACCGAGTGCAACTGCAGCATTGCCAACATGGAC 1200 

FIG. 62A 
GGCAAG.GTCGGCTCCTGCGGCTTCAACAGCCGTATCCTCA 1240

CGCATGTGTACCCCATCCGTCTGGTCAAGGTCAATGAGGA 1280 

CACGATGGAGCCACTGCGGGACTCCGAGGGCCTCTGCATC 1320 

CCGTGCCAGCCCGGGGAACCCGGCCTTCTCGTGGGCCAGA 1360 

TCAACCAGC AGGACCCTCTGCGGCGTTTCGATGGTTATGT 1 400 

1410 1420 1430 1440 


TAGTGACAGTGCC ACCAACAAGAAGATTGCCCACAGCGTI 1 440 
TTCCGAAAGGGCGATAGCGCCTACCTCTCAGGTGACGTGC 1 480 
TAGTGATGGACGAGCTGGGCTACATGTATTTCCGTGACCG 1 520 
CAGCGGGGAC ACCTTCCGCTGGCGCGGGGAGAACGTGTCC 1 560 
ACCACGGAGGTGGAAGCCGTGCTGAGCCGCCTACTGGGCC 1 600 

1610 1620 1630 1640 


AGACGGACGTGGCTGTGTATGGGGTGGCTGTGCCAGGAGT 1 640 
GGAGGGGAAAGCTGGCATGGC AGCC ATCGC AGATCCCC AC 1 680 
AGCCAGTTGGACCCTAACTCAATGTACCAGGAATTACAGA 1 720 
AGGTTCTTGCATCCTATGCTCGGCCCATCTTCCTGCGTCT 1 760 
TCTGCCCCAGGTGGATACCACAGGCACCTTCAAGATCCAG 1 800 

1810 1820 1830 1840 


AAGACCCGGCTGC AGCGTGAAGGCITTGACCCCCGTC AGA 1 840 
CCTCAGACAGGC TCTTC T TTC TAG ACC TG A AG TCCGGC AC 1 880 
GAGGTATCTACCCCTGGATGAGAGAGTCCATGCCCGCATT 1 920 
TGCGCAGGCGACTTCTCACTCTGAGCCTGGAGAGTGGGCT 1960 
GGGCCTGGACTCCTGAGACCTGGGAGCCTGACACCCCTCT 2000 

2010 2020 2030 2040 


TCGGGTGCTTCTCCTGCCTGGCCACATGGACAGCAGC ACC 2040 
TGTGAGAGTAGGAAAATGGAACCTGAGTGGCTGGGACCCC 2080 
TCTCCTACTTCCCACTATGCATCCATTTTGCCTCTGCCTT 2120 
GATCTTTTTCTCCATCTCTTTTCTCCCTACCCAGCAGGAG 2160 
CCCCACAAACACATGTTGGCTGCTGTGTCCTGCAGTTGGA 2200 
2230 2240

CCAGTGTCCAGGGGTACAGGCTTCAGGCTGTGACCCACAC 2240 
TGGTACCCACCTCCeTTTCCTATTTTGCCTTAGGTTCATC 2280 
CACGGTTCCCCTGTGGAGCAAGT-GGGGGCCCACATAGCTG 2320 
CTGTCCCTGCTGAGGGTTGG.TAGC AATCACACCCTCATGT 2360 
CAGCTGGGAGACACGCGCAGTCTCCCACTGACCCCCAATC 2400 

2410 2420 2430 2440 



AACTGAAAATATTGTTTTGACTACTTTTTGTTTTTTTGTT 2440 
TTTTTGTTTTTTTTTTTTTTCGAGACAGAGTTTCTCTGTA 2480 
TAGCCCTGGCTGTCCTGGAACTCACTTTGTAGACCAGGCT 2520 
GGCCTCGAACTCAAAAATCCTCCTGACTCTGCCTCTGCTT 2560 
CCC A AG TGCTGGGATTAA AG ACGTGCGCC AC CACCGCCTG 2600 




GCTGTTTTGTATTTTTGTTTTGTTTTGACGATAGGGTCTC 2640
ACTGTGGAGGCCAAGCTGGCCTCAGACTCCCCACCCCATT 2680 
.GCCTCTGGGCACC ATTCTATATTCTCAGACTGATGACAAT 2720 
GCACTAGTGTCCCTAGGAGTCTTGAGTCTGCACTTTCCCC 2760 
TCATAGCCTCAAGCTTCC AGAACTGACTCTGATCACTTGG 2800 

2810 2820 2830 2840 


ATGTGGCTAGTGTTGGCTCTACCCACATGTGTCAATTCAG 2840 
GGGTCCCCAGGCATAGTCTCTGGAAGCCCTCACCCGGAAA 2880 
AAGCTTG-GAGAGACCCAGGAAGGTTGTTGTGTTCTCTTGG 2920 
GCACCCCCTGGTGGCAGTCCTGGGCATGCTTCCGCACTGT 2960 
ACTGGTGCATATAGCCCAGACCTATGACATTTGAGGTCTA 3000 

3010 3020 3030 3040 


CCCTTCTGGCTCCTGTGGTCCCCATTGAGATCCTTGGTGA 3040 
CTCACCTCAGTCACCAAGCAGAGCCTCTGCCTGCCTTCAT 3080 
CTTCAAGGTCATGAAGGATGTGGACAGAGCAGCTACAGGC 3120 
TGCCAGCAGTCAACCACA'TGAGAGTGTTACTTCCTTGTTG 3160 
GTTTTTAAAAAATAAATGTGCTGAGCCTCGAAAAAAAAAA 3200 

3210 3220 3230 3240 


AAAAAAAAAAAAAAAAA 3217 
MRAPGAGTASVASLALLWFLGLPWTWSAAAAFCVYVGGGG 40
WRFLR I VCKTARRDLFGLSVLIRVRLELRRHRRAGDT I PC 80 
I FQAVARRQPERLALVDASSG ! CWTFAQLDTYSNAVANLF 120 
RQLGFAPGOV VAVFLEGRPEFVGLWLGLAKAGVVAALLNV 160 
NLRREPLAFCLGTSAAKAL I YGGEMAAAVAEVSEQLGKSL 200 



LKFCSGDLGPES I LPDTQLLDPMLAEAPTTPLAQAPGKGM 240 
DDRLFYI'.YTSGTTGLPKAAIVVHSRYYRI AAFGHHSYSMR 280 
AADVLYDCLPLYHSAGN I MGVGQCV I YGLTV VLRKKFSAS 320 
RFWDB€VKYNCTVVDD I GE I CRYLLRQPVRDVEQRHRVRL 360 
AVGNGLRPAIWEEFTQRFGVPQI GEFYGATECNCS I ANMO 400 



GKVGSCGFNSRILTHVYP I RLVK VNEDTMEPLRDSEGLC I 440 
PCQPGEPGLLVGQINGQDPLRRFDGYVSDSATNKKI AHSV 480
FRKGDSAYLSGDVLVMDELGYMYFRDRSGDTFRWRGENVS 520 
TTEVEAVLSRLLGQTDVAV YGVAVPGVEGKAGMAA I ADPH 560 
SQLOPNSMYQELQK VLAS YARP I FLRLLPGVDTTGTFK I Q 600 



KTRLQREGF DPRQTSDRLFFLDLKSGTRYLPLDERVHAR i 640 
CAGOFSL 647 
GGGCGGAGGCCGAGCCCAGTCGCCAGCTCCTGCTCTGCTC 40
CTCTCCCGCCTGCCGCCGCGCTGCACGCCTCGAGCACTCC 80 
CTCGGCCCCGGCGGGGACCGGGGACCCCGCAGCTACCGCC 120 
ATGCTGCCAGTGCTCTACACCGGCCTGGCGGGGCTGCTGC 160 
TGCTGCCTCTGCTGCTCACCTGCTGCTGCCCCTACCTCCT 200 

210 220 230 240 


CCAAGATGTGCGGTACTTCCTGCGGCTGGCCAACATGGCC 240 
CGGCGGGTGCGCAGCTACCGGCAGCGGCGACCCGTGCGTA 280 
CCATCCTGCGGGCCTTCCTGGAACAAGCGCGCAAGACCCC 320 
ACACAAGCCCTTCCTGCTGTTCCGAGACGAGACGCTCACC 360 
TACGCCCAGGTGGACCGGCGCAGCAACCAAGTGGCGCGGG 400 

410 420 430 440

CGCTGCACGATCAACTGGGCCTACGACAGGGGGATTGCGT 440 
AGCCCTCTTCATGGGCAATGAGCCGGCCTACGTGTGGATC 480 
TGGCTGGGACTGCTCAAACTGGGCTGTCCCATGGCGTGCC 520 
TCAACTACAACATTCGTGCCAAGTCTCTGCTGCACTGCTT 560 
TCAATGCTGCGGGGCGAAGGTGCTGCTGGCCTCCCCAGAT 600 

610 620 630 640 

CTACAAGAAGCTGTGGAGGAGGTTCTTCCAACCCTGAAAA 640 
AGGATGCCGTGTCCGTCTTTTACGTAAGCAGAACTTCTAA 680 
CACAAATGGTGTGGACACAATACTGGAC AAAGTAGACGGA 720 
GTGTCGGCGGAACCCACCCCGGAGTCGTGGAGGTCTGAAG 760 
TCACTTTTACCACGCCAGCAGTATACATTTATACTTCGGG 800 

810 820 830 840

AACCACAGGTCTTCC AAAAAGCGGAACCATCAATCATCA7 840 
CGCCTAAGGTATGGGACAAGCCTTGCTATGTCGAGTGGGA 880 
ATCACGGCCAAGGATGTCATCTATACCAACAATGCCCCTC- 920
TTCCAACAGTGCAACGCTCAAGATCGGCCT7CACGGATGC 960 
ATCCTGGGTTGGGGCTACTTTAACCTTGGCGGGGCAAATT 1000 

1010 1020 1030 1040 


CTCAAGCAAGCC AATTTTGGGAACGACTGGC AGGAAATAC 1040 
AACGTCAACGGTCATTCAGTACATTGGTGAACTGCTTCGG 1080 
TACCTGTGCAACACACCGCAGAAACCAAATGACCGGGACC 1 120 
ACAAAGTGAAAAAAGCCC7GGGAAATGGCTTACGAGGAGA 1 160 
TGTGTGGAGAGAGTTCATCAAGAGATTTGGGGAC ATCCAC 1200

FIG. 64A
GTGTATGAGTTCTACGCATCCACTGAAGGCAACATTGGAT 1240

TTGTGAACTATCCAAGGAAAATCGGTGCTGTCGGGAGAGC 1280 

AAACTACCTACAAAGAAAAGTTGCAAGGTATGAGCTGATC 1320 

AAGTATGACGTGGAGAAGGACGAGCCGGTCCGTGACGCAA 1360 

ATGGATATTGCATCAAAGTCCCCAAAGGTGAGGTTGGACT 1 400 

1410 1420 1430 1440 


CTTGGTTTGCAAAATCACAC AGCTCACACCATTTATTGGC 1440 

TATGCTGGAGGAAAGACCCAGACAGAGAAGAAAAAACTCA 1 480 

GAGATGTCTTTAAGAAAGGCGACATCTACTTCAACAGCGG 1520 

AGACCTCCTGATGATCGACCGTGAGAACTTCGTCTACTTT 1560 

CACGACAGGGTTGGAGATACTTTCCGGTGGAAAGGAGAGA 1600 

1610 1620 1630 1640

ACGTAGCTACCACAGAAGTCGCTGACATCGTGGGACTGGT 1640 

AGATTTTGTTGAAGAAGTGAATGTGTATGGCGTGCCTGTG 1680 

CCAGGTCATGAGGGTCGAATTGGGATGGCCTCCCTCAAGA 1720 

TCAAAGAAAACTACGAGTTCAATGGAAAGAAACTCTTTCA 1760 

ACACATCGCGGAGTACCTGCCC AGTTACGCGAGGCCTCGG 1800 

1810 1820 1830 1840 

TTCCTGAGGATACAAGATACCATTGAGATC ACTGGGACTT 1840 
TTAAACACCGCAAAGTGACCCTGATGGAAGAGGGCTTCAA 1880 
TCCCACAGTCATCAAAGATACCTTGTATTTCATGGATGAT 1920 
GCAGAGAAAACATTTGTGCCCATGACTGAGAACATTTATA 1960 
ATGCCATAATTGATAAAACTCTGAAGCTCTGAATATTCCC 2000 

2010 2020 2030 2040 

TGGTGGTTTAGCTCATGACATTTCCAGAAAGAAACTCGAT 2040 
AGACCTCGCAGAGCCACTTC ATACGTAGAATCCAACTTTA 2080 
ACTTGATTGAAGACTATAAGGTGCGATTTTATTTTTAGGA 2120 
AATTATTCATTAAAAGGATAGTTTTTTTTTTTTTTTTTAA 2160 
TTACACCTGAACCTTTGC AAGTAAAAAGATTTAGAGACAA 2200 

2210 2220 2230 2240 


TTATTTTTCAATGTGC ACC7GCCATTTGTCCTTGCAAACT 2240 
AAGCTTCTTGGAGAGAGGGCCTTATTTTTTTAAAGACATA 2280 
ATAAACTATATTAACACTAAAAAAAAAAAAAAAAAAAAAA 2320 
AAAAAAAAAAAAAAAAAA 2338

FIG. 64B






MLP V L YTGL AGLL LLPLL L TC C CP YLLGDVRYFLRLANM A 40 
RRVRSYRQRRP VRT I LRAFLEQARKTPHKPFLLFRDETLT 80 
YAQVDRRSNQVARALHDQLGLRQGDCVALFMGNEPAYVWI 120 
WLGLLKLGCPMACLNYN I RAKSLLHCF GCCGAK VLLASPD 160 
LQEAVEEVLPTLKKDAVSVFYVSRTSNTNGVDT I LDKVDG 200 
210 220 230 240 



VSAEPTPESWRSEVTFTTPAVY I YTSGTTGLPKSGT I NHH 240 
RLRYGTSLAtfSSGNHGQGCHLYQQCPCSNSATLK I GLHGC 280 
I LGWGYFNLGGANSQASQFWERLAGNTTSTV I QY I GELLR 320 
YLCNTPQKf-NDRDKKVKKALGNGLRGDVWREF I KRFGO I H 360 
VYEFYASTEGN I GFVNYPRK I GA VGRANYLQRK VARYEL I 400 

410 420 430 440 



KYDVEKDEPVRDANGYC I K VPKGEVGLLVCK ITQLTPFIG 440 
YAGGKTQTEKKKLRDVFKKGD I YFNSGDLLM IDRENFVYF 480 
HDRVGDTFRWKGENVATTEVAD IVGLVDFVEEVNVYGVPV 520 
PGKEGRIGMASLK I KENYEFNGKKLFQH I AEYLPSYARPR 560 
FLR I QDT I E I TGTFKHRK VTLMEEGFNPTV I KDTLYFMDD 600 





630 640



AEKTFVPMTEN! YNA! IDKTLKL 623 
GAAAGCTCTGAGAGCGGGTGCAGTCTGGCCTGGCGTCTCG 40

CGTACCTGGCCCGGGAGCAGCCGACACAC ACCTTCCTCAT 80 

CCACGGCGCGCAGCGCTTTAGCTACGCGGAGGCTGAGCGC 120 

GAGAGCAACCGGATTGCTCGCGCCTTTCTGCGCGCACGGG 160 

GCTGGACCGGGGGCCGCCGAGGCTCGGGCAGGGGCAGCAC 200 

210 220 230 240 


TGAGGAAGGCGCACGCGTGGCGCCTCCGGCTGGAGATGCG 240 
GCTGCTAGAGGGACGACCGCGCCCCCTCTGGCACCCGGGG 280 
CGACCGTGGCGCTGCTCCTCCCAGCGGGCCCGGATTTCCT 320 
TTGGATTTGGTTCGGACTGGCCAAAGCTGGCCTGCGCACG 360 
GCCTTTGTGCCCACCGCTTTACGCCGAGGACCCCTGCTGC 400 

410 420 430 440

ACTGCCTCCGCAGCTGCGGTGCGAGTGCGCTCGTGCTGGC 440 
CACAGAGTTCCTGGAGTCCCTGGAGCCGGACCTGCCGGCC 480 
TTGAGAGCCATGGGGCTCCACCTATGGGCGACGGGCCCTG 520 
AAACTAATGTAGCTGGAATCAGCAATTTGCTATCGGAAGC 560 
AGCAGACCAAGTGGATGAGCCAGTGCCGGGGTACCTCTCT 600 

610 620 630 640 


GC CC CCC AG AAC AT AATGGACACCTGCCTGTACATCTTC A 640 
CCTCTGGCACTACTGGCCTGCCCAAGGCTGCTCGAATCAG 680 
TCATCTGAAGGTTCTACAGTGCCAGGGATTCTACCATCTG 720 
TGTGGAGTCCACCAGGAGGACGTGATCTACC7CGCACTCC 760 
CACTGTACCACATGTCTGGCTCCCTTCTGGGCATTGTGGG 800 

810 820 830 840 

CTGCTTGGGCATTGGGGCCACCGTGGTGCTGAAACCCAAG 840 
TTCTCAGCTAGCCAGTTCTGGGACGA7TGCCAGAAACACA 880 
GGGTGACAGTGTTCCAGTACATTGGGGAGTTGTGCCGATA 920 
CCTCGTCAACCAGCCCCCGAGCAAGGCAGAGTTTGACCAT 960 
AAGGTGCGCTTGGCAGTGGGCAGTGGGTTGCGCCCAGACA 1000 

1010 1020 1030 1040 


CCTGGGAGCGTTTCCTGCGGCGATTTGGACCTCTGCAGAT 1040 
ACTGGAGACGTATGGCATGACAGAGGGCAACGTAGCTACG 1080 
TTCAATTACAC AGGACGGC AGGGTGCAGTGGGGCGAGCTT 1 120 
CCTGGCTTTACAAGCACATCTTCCCCTTCTCCTTGATTCG 1 160 
ATACGATGTCATGACAGGGGAGCCTATTCGGAATGCCCAG 1200 
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GGGCACTGCATGACCACATCTCCAGGTGAGCCAGGCCTAC 1240 
TGGTGGCCCCAGTGAGCCAGCAGTCCCCCTTCCTGGGCTA 1280
TGCTGGGGCTCCGGAGCTGGCCAAGGACAAGCTGCTGAAG 1320 
GATGTCTTCTGGTCTGGGGACGTTTTCTTCAATACTGGGG 1360 
ACCTCTTGGTCTGTGATGAGCAAGGCTTTCTTCACTTCCA 1400

1410 1420 1430 1440

CGATCGTACTGGAGACACCATCAGGTGGAAGGGAGAGAAT 1440 
GTGGCCACAACTGAAGTGGCTGAGGTCTTGGAGACCCTGG 1480
ACTTCCTTCAGGAGGTGAACATCTATGGAGTCACGGTGCC 1520
AGGGCACGAAGGCAGGGCAGGCATGGCGGCCTTGGCTCTG 1560
CGGCCCCCGCAGGCTCTGAACCTGGTGGAGCTCTACAGCC 1600 

1610 1620 1630 1640 


ATGTTTCTGAGAACTTGCCACCGTATGCCCGACCTCGGTT 1640 
TCTCAGGCTCCAGGAATCTTTGGCCACTACTGAGACCTTC 1680 
AAACAGCAGAAGGTTAGGATGGCCAATGAGGGCTTTGACC 1720
CCAGTGTACTGTCTGACCCACTCTATGTTCTGGACCAAGA 1760 
TATAGGGGCCTACCTGCCCCTCACACCTGCCCGGTACAGT 1800 

1810 1820 1830 1840 


GCCCTCCTGTCTGGAGACCTTCGAATCTGAAACCTTCCAC 1840 
TTGAGGGAGGGGCTCGGAGGGTACAGGCCACCATGGCTGC 1880 
ACCAGGGAGGGTTTTCGGGTATCTTTTGTATATGGAGTCA 1920 
TTATTTTGTAATAAACAGCTGGAGCTTAAAAAAAAAAAAA 1960 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 1998
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

ESSESGCSLAWRLAYLAREQPTHTFLIHGAQRFSYAEAER 40
ESNRIARAFLRARGWTGGRRGSGRGSTEEGARVAPPAGDA 80
AARGTTAPPLAPGATVALLLPAGPDFLWIWFGLAKAGLRT 120
AFVPTALRRGPLLHCLRSCGASALVLATEFLESLEPDLPA 160
LRAMGLHLWATGPETNVAGISNLLSEAADQVDEPVPGYLS 200

210 220 230 240

APQNIMDTCLYIFTSGTTGLPKAARISHLKVLQCQGFYHL 240
CGVHQEDVIYLALPLYHMSGSLLGIVGCLGIGATVVLKPK 280
FSASQFWDDCQKHRVTVFQYIGELCRYLVNQPPSKAEFDH 320
KVRLAVGSGLRPDTWERFLRRFGPLQILETYGMTEGNVAT 360
FNYTGRQGAVGRASWLYKHIFPFSLIRYDVMTGEPIRNAQ 400

410 420 430 440 

GHCMTTSPGEPGLLVAPVSQQSPFLGYAGAPELAKDKLLK 440
DVFWSGDVFFNTGDLLVCDEQGFLHFHDRTGDTIRWKGEN 480
VATTEVAEVLETLDFLQEVNIYGVTVPGHEGRAGMAALAL 520
RPPQALNLVQLYSHVSENLPPYARPRFLRLQESLATTETF 560
KQQKVRMANEGFDPSVLSDPLYVLDQDIGAYLPLTPARYS 600

610 620 630 640 


ALLSGDLRI 609
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

ATGCTGCTTGGAGCCTCTCTGGTGGGGGCGCTACTGTTCT 40 
CCAAGCTAGTGCTGAAGCTGCCCTGGACCCAGGTGGGATT 80
CTCCCTGTTGCTCCTGTACTTGGGGTCTGGTGGCTGGCGT 120 
TTCATCCGGGTCTTCATCAAGACGGTCAGGAGAGATATCT 160 
TTGGTGGCATGGTGCTCCTGAAGGTGAAGACCAAGGTGCG 200 

210 220 230 240 


ACGGTACCTTCAGGAGCGGAAGACGGTGCCCCTGCTGTTT 240 
GCTTCAATGGTACAGCGCCACCCGGACAAGACAGCCCTGA 280
TTTTCGAGGGCACAGACACTCACTGGACCTTCCGCCAGCT 320
GGATGAGTACTCCAGTAGTGTGGCCAACTTCCTGCAGGCC 360 
CGGGGCCTGGCCTCAGGCAATGTAGTTGCCCTCTTTATGG 400 

410 420 430 440 


AAAACCGCAATGAGTTTGTGGGTCTGTGGCTAGGCATGGC 440

CAAGCTGGGCGTGGAGGCGGCTCTCATCAACACCAACCTT 480 

AGGCGGGATGCCCTGCGCCACTGTCTTGACACCTCAAAGG 520 

CACGAGCTCTCATCTTTGGCAGTGAGATGGCCTCAGCTAT 560

CTGTGAGATCCATGCTAGCCTGGAGCCCAGACTCAGCCTC 600 

610 620 630 640 


TTCTGCTCTGGATCCTGGGAGCCCAGCACAGTGCCCGTCA 640
GCACAGAGCATCTGGACCCTCTTCTGGAAGATGCCCCGAA 680
GCACCTGCCCAGTCACCCAGACAAGGGTTTTACAGATAAG 720
CTCTTCTACATCTACACATCGGGCACCACGGGGCTACCCA 760
AAGCTGCCATTGTGGTGCACAGCAGGTATTATCGTATGGC 800

810 820 830 840 


TTCCCTGGTGTACTATGGATTCCGCATGCGGCCTGATGAC 840 
ATTGTCTATGACTGCCTCCCCCTCTACCACTCAAGCAGGA 880
AACATCGTGGGGATTGGCAGTGCTTACTCCACGGCATGAC 920 
TGTGGTGATCCGGAAGAAGTTCTCAGCCTCCCGGTTCTGG 960 
GATGATTGTATCAAGTACAACTGCACAGTGGTACAGTACA 1000

1010 1020 1030 1040 


TTGGCGAGCTCTGCCGCTACCTCCTGAACCAGCCACCCCG 1040
TGAGGCTGAGTCTCGGCACAAGGTGCGCATGGCACTGGGC 1080
AACGGTCTCCGGCAGTCCATCTGGACCGACTTCTCCAGCC 1120
GTTTCCACATCCCCCAGGTGGCTGAGTTCTATGGGGCCAC 1160
TGAATGCAACTGTAGCCTGGGCAACTTTGACAGCCGGGTG 1200
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

GGGGCCTGTGGCTTCAATAGCCGCATCCTGTCCTTTGTGT 1240 
ACCCTATCCGTTTGGTACGTGTCAATGAGGATACCATGGA 1280
ACTGATCCGGGGACCCGATGGAGTCTGCATTCCCTGTCAA 1320 
CCAGGTCAGCCAGGCCAGCTGGTGGGTCGCATCATCCAGC 1360 
AGGACCCTCTGCGCCGTTTCGACGGGTACCTCAACCAGGG 1400 

1410 1420 1430 1440

TGCCAACAACAAGAAGATTGCTAATGATGTCTTCAAGAAG 1440 
GGGGACCAAGCCTACCTCACTGGTGACGTCCTGGTGATGG 1480 
ATGAGCTGGGTTACCTGTACTTCCGAGATCGCACTGGGGA 1520 
CACGTTCCGCTGGAAAGGGGAGAATGTATCTACCACTGAG 1560
GTGGAGGGCACACTCAGCCGCCTGCTTCATATGGCAGATG 1600

1610 1620 1630 1640

TGGCAGTTTATGGTGTTGAGGTGCCAGGAACTGAAGGCCG 1640 
AGCAGGAATGGCTGCCGTTGCAAGTCCCATCAGCAACTGT 1680
GACCTGGAGAGCTTTGCACAGACCTTGAAAAAGGAGCTGC 1720
CTCTGTATGCCCGCCCCATCTTCCTGCGCTTCTTGCCTGA 1760
GCTGCACAAGACAGGGACCTTCAAGTTCCAGAAGACAGAG 1800

1810 1820 1830 1840 


TTGCGGAAGGAGGGCTTTGACCCATCTGTTGTGAAAGACC 1840 
CGCTGTTCTATCTGGATGCTCGGAAGGGCTGCTACGTTGC 1880 
ACTGGACCAGGAGGCCTATACCCGCATCCAGGCAGGCGAG 1920 
GAGAAGCTGTGATTTCCCCCTACATCCCTCTGAGGGCCAG 1960 
AAGATGCTGGATTCAGAGCCCTAGCGTCCACCCCAGAGGG 2000 



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

TCCTGGQCAATGCCAGACCAAAGCTAGCAGGGCCCGCACC 2040 
TCCGCCCCTAGGTGCTGATCTCCCCTCTCCCAAACTGCCA 2080 
AGTGACTCACTGCCGCTTCCCCGACCCTCCAGAGGCTTTC 2120 
TGTGAAAGTCTCATCCAAGCTGTGTCTTCTGGTCCAGGCG 2160 
TGGCCCCTGGCCCCAGGGTTTCTGATAGGCTCCTTTAGGA 2200 

2210 2220 2230 2240

TGGTATCTTGGGTCCAGCGGGCCAGGGTGTGGGAGAGGAG 2240 
TCACTAAGATCCCTCCAATCAGAAGGGAGCTTACAAAGGA 2280
ACCAAGGCAAAGCCTGTAGACTCAGGAAGCTAAGTGGCCA 2320
GAGACTATAGTGGCCAGTCATCCCATGTCCACAGAGGATC 2360 
TTGGTCCAGAGCTGCCAAAGTGTCACCTCTCCCTGCCTGC 2400

2410 2420 2430 2440 

ACCTCTGGGGAAAAGAGGACAGCATGTGGCCACTGGGCAC 2440
CTGTCTCAAGAAGTCAGGATCACACACTCAGTCCTTGTTT 2480 
CTCCAGGTTCCCTTGTTCTTGTCTCGGGGAGGGAGGGACG 2520 
AGTGTCCTGTCTGTCCTTCCTGCCTGTCTGTGAGTCTGTG 2560
TTGCTTCTCCATCTGTCCTAGCCTGAGTGTGGGTGGAACA 2600 



FIG. 68C 



2610 2620 2630 2640 


GGCATGAGGAGAGTGTGGCTCAGGGGCCAATAAACTCTGC 2640
CTTGACTCCTCTTAAAAAAAAAAAAAAAAAAAAAAAAAAA 2680 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 2710 
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MLLGASLVGALLFSKLVLKLPWTQVGFSLLLLYLGSGGWR 40
FIRVFIKTVRRDIFGGMVLLKVKTKVRRYLQERKTVPLLF 80
ASMVQRHPDKTALIFEGTDTHWTFRQLDEYSSSVANFLQA 120
RGLASGNVVALFMENRNEFVGLWLGMAKLGVEAALINTNL 160
RRDALRHCLDTSKARALIFGSEMASAICEIHASLEPTLSL 200

210 220 230 240 


FCSGSWEPSTVPVSTEHLDPLLEDAPKHLPSHPDKGFTDK 240 
LFYIYTSGTTGLPKAAIVVHSRYYRMASLVYYGFRMRPDD 280
IVYDCLPLYHSSRKHRGDWQCLLHGMTVVIRKKFSASRFW 320
DDCIKYNCTVVQYIGELCRYLLNQPPREAESRHKVRMALG 360
NGLRQSIWTDFSSRFHIPQVAEFYGATECNCSLGNFDSRV 400

410 420 430 440 


GACGFNSRILSFVYPIRLVRVNEDTMELIRGPDGVCIPCQ 440
PGQPGQLVGRIIQQDPLRRFDGYLNQGANNKKIANDVFKK 480
GDQAYLTGDVLVMDELGYLYFRDRTGDTFRWKGENVSTTE 520
VEGTLSRLLHMADVAVYGVEVPGTEGRAGMAAVASPISNC 560
DLESFAQTLKKELPLYARPIFLRFLPELHKTGTFKFQKTE 600

610 620 630 640 


LRKEGFDPSVVKDPLFYLDARKGCYVALDQEAYTRIQAGE 640
EKL 643 



FIG. 69 



SUBSTITUTE SHEET (RULE 26) 



140/202 



10 20 30 40 


CACTCATCAGAGCTAAGAGAGACTACACGCTCTCATCTAC 40 
TTCAGAAAGAGCCAATGCCATGGGTATTTGGAAGAAACTA 80 
ACCTTACTGCTGTTGCTGCTTCTGCTGGTTGGCCTGGGGC 120 
AGCCCCCATGGCCAGCAGCTATGGCTCTGGCCCTGCGTTG 160 
GTTCCTGGGAGACCCCACATGCCTTGTGCTGCTTGGCTTG 200 

210 220 230 240

GCATTGCTGGGCAGACCCTGGATCAGCTCCTGGATGCCCC 240 
ACTGGCTGAGCCTGGTAGGAGCAGCTCTTACCTTATTCCT 280 
ATTGCCTCTACAGCCACCCCCAGGGCTACGCTGGCTGCAT 320 
AAAGATGTGGCTTTCACCTTCAAGATGCTTTTCTATGGCC 360 
TAAAGTTCAGGCGACGCCTTAATAAACATCCTCCAGAGAC 400

410 420 430 440 


CTTTGTGGATGCTTTAGAGCGGCAAGCACTGGCATGGCCT 440 
GACCGGGTGGCCTTGGTGTGTACTGGGTCTGAGGGCTCCT 480 
CAATCACAAATAGCCAGCTGGATGCCAGGTCCTGTCAGGC 520 
AGCATGGGTCCTGAAAGCAAAGCTGAAGGATGCCGTAATC 560 
CAGAACACAAGAGATGCTGCTGCTATCTTAGTTCTCCCGT 600

610 620 630 640

CCAAGACCATTTCTGCTTTGAGTGTGTTTCTGGGGTTGGC 640 
CAAGTTGGGCTGCCCTGTGGCCTGGATCAATCCACACAGC 680 
CGAGGGATGCCCTTGCTACACTCTGTACGGAGCTCTGGGG 720
CCAGTGTGCTGATTGTGGATCCAGACCTCCAGGAGAACCT 760
GGAAGAAGTCCTTCCCAAGCTGCTAGCTGAGAACATTCAC 800 

810 820 830 840 


TGCTTCTACCTTGGCCACAGCTCACCCACCCCGGGAGTAG 840
AGGCTCTGGGAGCTTCCCTGGATGCTGCACCTTCTGACCC 880 
AGTACCTGCCAGCCTTCGAGCTACGATTAAGTGGAAATCT 920 
CCTGCCATATTCATCTTTACTTCAGGGACCACTGGACTCC 960 
CAAAGCCAGCCATCTTATCACATGAGCGGGTCATACAAGT 1000 

1010 1020 1030 1040 


GAGCAACGTGCTGTCCTTCTGTGGATGCAGAGCTGATGAT 1040 

GTGGTCTATGACGTCCTACCTCTGTACCATACGATAGGGC 1080 

TTGTCCTTGGATTCCTTGGCTGCTTACAAGTTGGAGCCAC 1120

CTGTGTCCTGGCCCCCAAGTTCTCTGCCTCCCGATTCTGG 1160

GCTGAGTGCCGGCAGCATGGCGTAACAGTGATCTTGTATG 1200 
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

TGGGTGAAATCCTGCGGTACTTGTGTAACGTCCCTGAGCA 1240 
ACCAGAAGACAAGATACATACAGTGCGCTTGGCCATGGGA 1280
ACTGGACTTCGGGCAAATGTGTGGAAAAACTTCCAGCAAC 1320
GCTTTGGTCCCATTCGGATCTGGGAATTCTACGGATCCAC 1360 
AGAGGGCAATGTGGGCTTAATGAACTATGTGGGCCACTGC 1400 

1410 1420 1430 1440 


GGGGCTGTGGGAAGGACCAGCTGCATCCTTCGAATGCTGA 1440
CTCCCTTTGAGCTTGTACAGTTCGACATAGAGACAGCAGA 1480
GCCTCTGAGGGACAAACAGGGTTTTTGCATTCCTGTGGAG 1520
CCAGGAAAGCCAGGACTTCTTTTGACCAAGGTTCGAAAGA 1560
ACCAACCCTTCCTGGGCTACCGTGGTTCCCAGGCCGAGTC 1600 

1610 1620 1630 1640 


CAATCGGAAACTTGTTGCGAATGTACGACGCGTAGGAGAC 1640 
CTGTACTTCAACACTGGGGACGTGCTGACCTTGGACCAGG 1680 
AAGGCTTCTTCTACTTTCAAGACCGCCTTGGTGACACCTT 1720 
CCGGTGGAAGGGCGAAAACGTATCTACTGGAGAGGTGGAG 1760 
TGTGTTTTGTCTAGCCTAGACTTCCTAGAGGAAGTCAATG 1800 

1810 1820 1830 1840 


TCTATGGTGTGCCTGTGCCAGGGTGTGAGGGTAAGGTTGG 1840 
CATGGCTGCTGTGAAACTGGCTCCTGGGAAGACTTTTGAT 1880 
GGGCAGAAGCTATACCAGCATGTCCGCTCCTGGCTCCCTG 1920
CCTATGCCACACCTCATTTCATCCGTATCCAGGATTCCCT 1960 
GGAGATCACAAACACCTACAAGCTGGTAAAGTCACGGCTG 2000 

2010 2020 2030 2040 

GTGCGTGAGGGTTTTGATGTGGGGATCATTGCTGACCCCC 2040
TCTACATACTGGACAACAAGGCCCAGACCTTCCGGAGTCT 2080 
GATGCCAGATGTGTACCAGGCTGTGTGTGAAGGAACCTGG 2120 
AATCTCTGACCACCTAGCCAACTGGAAGGCAATCCAAAAG 2160
TGTAGAGATTGACACTAGTCAGCTTCACAAAGTTGTCCGG 2200

2210 2220 2230 2240 


GTTCCAGATGCCCATGGCCCAGTAGTACTTAGAGAATAAA 2240
CTTGAATGTGTATACAAAAAAAAAAAAAAAAAAAAAA 2277 
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MGIWKKLTLLLLLLLLVGLGQPPWPAAMALALRWFLGDPT 40 
CLVLLGLALLGRPWISSWMPHWLSLVGAALTLFLLPLQPP 80
PGLRWLHKDVAFTFKMLFYGLKFRRRLNKHPPETFVDALE 120
RQALAWPDRVALVCTGSEGSSITNSQLDARSCQAAWVLKA 160
KLKDAVIQNTRDAAAILVLPSKTISALSVFLGLAKLGCPV 200

210 220 230 240 



AWINPHSRGMPLLHSVRSSGASVLIVDPDLQENLEEVLPK 240
LLAENIHCFYLGHSSPTPGVEALGASLDAAPSDPVPASLR 280
ATIKWKSPAIFIFTSGTTGLPKPAILSHERVIQVSNVLSF 320
CGCRADDVVYDVLPLYHTIGLVLGFLGCLQVGATCVLAPK 360
FSASRFWAECRQHGVTVILYVGEILRYLCNVPEQPEDKIH 400



TVRLAMGTGLRANVWKNFQQRFGPIRIWEFYGSTEGNVGL 440
MNYVGHCGAVGRTSCILRMLTPFELVQFDIETAEPLRDKQ 480
GFCIPVEPGKPGLLLTKVRKNQPFLGYRGSQAESNRKLVA 520
NVRRVGDLYFNTGDVLTLDQEGFFYFQDRLGDTFRWKGEN 560
VSTGEVECVLSSLDFLEEVNVYGVPVPGCEGKVGMAAVKL 600 



APGKTFDGQKLYQHVRSWLPAYATPHFIRIQDSLEITNTY 640
KLVKSRLVREGFDVGIIADPLYILDNKAQTFRSLMPDVYQ 680







AVCEGTWNL 689
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

GCTCTCTGGGCCTATATCAAGCTGCTGAGGTACACGAAGC 40 
GCCATGAGCGGCTCAACTACACGGTGGCGGACGTCTTCGA 80 
ACGAAATGTTCAGGCCCATCCGGACAAGGTGGCTGTGGTC 120 
AGTGAGACGCAACGCTGGACCTTCCGTCAGGTGAACGAGC 160 
ATGCGAACAAGGTGGCCAATGTGCTGCAGGCTCAGGGCTA 200 

210 220 230 240

CAAAAAGGGCGATGTGGTGGCCCTGTTGCTGGAGAACCGC 240 
GCCGAGTACGTGGCCACCTGGCTGGGTCTCTCCAAGATCG 280 
GTGTGATCACACCGCTGATCAACACGAATCTGCGCGGTCC 320 
CTCCCTGCTGCACAGCATCACGGTGGCCCATTGCTCGGCT 360 
CTCATTTACGGCGAGGACTTCCTGGAAGCTGTCACCGACG 400 

410 420 430 440 


TGGCCAAGGATCTGCCAGCGAACCTCACACTCTTCCAGTT 440 
CAACAACGAGAACAACAACAGCGAGACGGAAAAGAACATA 480
CCGCAGGCCAAGAATCTGAACGCGCTGCTGACCACGGCCA 520 
GCTATGAGAAGCCTAACAAGACGCAGGTTAACCACCACGA 560 
CAAGCTGGTCTACATCTACACCTCCGGCACCACAGGATTG 600 

610 620 630 640

CCAAAGGCTGCGGTTATCTCTCACTCCCGTTATCTGTTTA 640 
TCGCTGCTGGCATCCACTACACCATGGGTTTCCAGGAGGA 680 
GGACATCTTCTACACGCCCTTGCCTTTGTACCACACCGCT 720 
GGTGGCATTATGTGCATGGGTCAGTCGGTGCTCTTTGGCT 760
CCACGGTCTCCATTCGCAAGAAGTTCTCGGCATCCAACTA 800 

810 820 830 840 

TTTCGCCGACTGCGCCAAGTATAATGCAACTATTGGTCAG 840
TATATCGGTGAGATGGCTCGCTACATTCTAGCTACGAAAC 880
CCTCGGAATACGACCAGAAACACCGAGTGCGTCTGGTCTT 920
TGGAAACGGACTGCGACCGCAGATTTGGCCACAGTTTGTG 960
CAGCGCTTCAACATTGCCAAGGTTGGCGAGTTCTACGGCG 1000

1010 1020 1030 1040 

CCACCGAGGGTAATGCGAACATCATGAATCATGACAACAC 1040 

GGTGGGCGCCATCGGCTTTGTGTCGCGCATCCTGCCCAAG 1080 

ATCTACCCAATCTCGATCATTCGCGCCGATCCGGACACCG 1120

GAGAGCCCATTAGAGATAGGAATGGCCTATGCCAACTGTG 1160

CGCTCCCAACGAGCCAGGCGTATTCATCGGCAAGATCGTC 1200 
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AAAGGAAATCCTTCTCGCGAATTCCTCGGATACGTCGATG 1240 
AAAAGGCCTCCGCGAAGAAGATTGTTAAGGATGTGTTCAA 1280 
GCATGGCGATATGGCTTTCATCTCCGGAGATCTGCTGGTT 1320 
GCCGACGAGAAGGGTTATCTGTACTTCAAGGATCGCACCG 1360 
GTGACACCTTCCGCTGGAAGGGCGAGAATGTTTCCACCAG 1400 

1410 1420 1430 1440

CGAGGTGGAGGCGCAAGTCAGCAATGTGGCCGGTTACAAG 1440 
GATACCGTCGTTTACGGCGTAACCATTCCGCACACCGAGG 1480 
GAAGGGCCGGCATGGCCGCCATCTATGATCCGGAGCGAGA 1520 
ATTGGACCTCGACGTCTTCGCCGCTAGCTTGGCCAAGGTG 1560 
CTGCCCGCGTACGCTCGTCCCCAGATCATTCGATTGCTCA 1600 

1610 1620 1630 1640 

CCAAGGTGGACCTGACTGGAACCTTTAAGCTGCGCAAGGT 1640 
AGACCTGCAGAAGGAGGGCTACGATCCGAACGCGATCAAG 1680 
GACGCGCTGTACTACCAGACTTCCAAGGGTCGGTACGAGC 1720
TGCTCACGCCCCAGGTTTACGACCAGGTGCAGCGCAACGA 1760 
AATCCGCTTCTAAGAGCTGCAATAGAGTTGTGTCTGAACC 1800 

1810 1820 1830 1840 


TTGCCTTTTGCCCAATATGCTGTTAATTAGTTTGTAAGGC 1840 
TAAGTGTAGTAGAGGAAAATCGGGGGAAATCGGCAGCAAA 1880 
GATCATTCAGCCTAGGAGAGATGCATCCGAAGCACATTTC 1920 
CATGTCAACAATGCACTTTTGTATATCGTAAGCATATATA 1960 
TATCGTATATCGTAAACGTAGTTGTATCTGCATTTGTGTA 2000 

2010 2020 2030 2040 

GATGATAGCCTCCTATACGCATTTCAATTGTTTTTAGCGT 2040 
GCTAAAGAACCTTGTTAAATGCAATTTCAGCTATTGTTTA 2080 
GTCAGTTTTAGTGGCATTTACACTTCCATTCTCGTTGCGT 2120 
TTCGTTTTTGCCTGTACATATGAGAAGCTCTGATGTTTTT 2160 
GTATCAAATAAAGTTTTTTCCTTCACCACGGACCACGTGA 2200

2210 2220 2230 2240 

AAAAAAAAAAAAAAAAAAAAA 2221 
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ALWAYIKLLRYTKRHERLNYTVADVFERNVQAHPDKVAVV 40
SETQRWTFRQVNEHANKVANVLQAQGYKKGDVVALLLENR 80
AEYVATWLGLSKIGVITPLINTNLRGPSLLHSITVAHCSA 120
LIYGEDFLEAVTDVAKDLPANLTLFQFNNENNNSETEKNI 160
PQAKNLNALLTTASYEKPNKTQVNHHDKLVYIYTSGTTGL 200

210 220 230 240 


PKAAVISHSRYLFIAAGIHYTMGFQEEDIFYTPLPLYHTA 240
GGIMCMGQSVLFGSTVSIRKKFSASNYFADCAKYNATIGQ 280
YIGEMARYILATKPSEYDQKHRVRLVFGNGLRPQIWPQFV 320
QRFNIAKVGEFYGATEGNANIMNHDNTVGAIGFVSRILPK 360
IYPISIIRADPDTGEPIRDRNGLCQLCAPNEPGVFIGKIV 400

410 420 430 440 


KGNPSREFLGYVDEKASAKKIVKDVFKHGDMAFISGDLLV 440
ADEKGYLYFKDRTGDTFRWKGENVSTSEVEAQVSNVAGYK 480
DTVVYGVTIPHTEGRAGMAAIYDPERELDLDVFAASLAKV 520
LPAYARPQIIRLLTKVDLTGTFKLRKVDLQKEGYDPNAIK 560
DALYYQTSKGRYELLTPQVYDQVQRNEIRF 590 
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AGTGTAGATACCACAGGAACGTTTAAAATCCAGAAGACCA 40
GACTGCAAAGGGAAGGATACGATCCACGGCTCACAACTGA 80 
CCAGATCTACTTCCTAAACTCCAGAGCAGGGCGTTACGAG 120
CTTGTCAACGAGGAGCTGTACAATGCATTTGAACAAGGGC 160 

AGGATTTCCCTTT 173 
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SVDTTGTFKIQKTRLQREGYDPRLTTDQIYFLNSRAGRYE 40
LVNEELYNAFEQGQDFP 57
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ATGAAGCTGGAGGAGCTTGTGACAGTTATGCTTCTCACAG 40 
TGGCTGTCATTGCTCAGAATCTTCCGATTGGAGTAATATT 80 
GGCTGGAGTTCTTATTTTATACATCACAGTGGTTCATGGA 120 
GATTTCATTTATAGAAGTTATCTTACGTTGAATAGGGATT 160 
TAACAGGATTGGCTCTAATTATTGAAGTCAAAATCGACCT 200 

210 220 230 240 


ATGGTGGAGGTTGCATCAGAATAAAGGAATCCATGAACTG 240 
TTTTTGGATATTGTGAAAAAGAATCCAAATAAGCCGGCGA 280 
TGATTGACATCGAGACGAATACAACAGAAACATACGCAGA 320 
GTTCAATGCACATTGTAATAGATATGCCAATTATTTCCAG 360
GGTCTTGGCTATCGATCCGGAGACGTTGTCGCCTTGTACA 400 

410 420 430 440 


TGGAGAACTCGGTCGAGTTTGTGGCCGCGTGGATGGGACT 440 
CGCAAAAATCGGAGTTGTAACGGCTTGGATCAACTCGAAT 480
TTGAAAAGAGAGCAACTTGTTCATTGTATCACTGCGAGCA 520
AGACAAAGGCGATTATCACAAGTGTAACACTTCAGAATAT 560
TATGCTTGATGCTATCGATCAGAAGCTGTTTGATGTTGAG 600 

610 620 630 640 


GGAATTGAGGTTTACTCTGTCGGAGAGCCCAAGAAGAATT 640 
CTGGATTCAAGAATCTCAAGAAGAAGTTGGATGCTCAAAT 680
TACTACGGAACCAAAGACCCTTGACATAGTAGATTTTAAA 720 
AGTATTCTTTGCTTCATCTATACAAGTGGTACTACTGGAA 760 
TGCCAAAAGCCGCTGTCATGAAGCACTTCAGATATTACTC 800 

810 820 830 840

GATTGCCGTTGGAGCCGCAAAATCATTCGGAATCCGCCCT 840 
TCTGATCGTATGTACGTCTCGATGCCAATTTATCACACTG 880 
CAGCTGGAATTCTTGGAGTTGGGCAAGCTCTGTTGGGTGG 920 
ATCATCGTGTGTCATTAGAAAAAAATTCTCGGCTAGCAAC 960 
TTTTGGAGGGATTGTGTAAAGTATGATTGTACAGTTTCAC 1000 

1010 1020 1030 1040

AATACATTGGAGAGATTTGTCGGTACTTGTTGGCTCAGCC 1040 
AGTTGTGGAAGAGGAATCCAGGCATAGAATGAGATTGTTG 1080 
GTTGGAAACGGACTCCGTGCTGAAATCTGGCAACCATTTG 1120
TAGATCGATTCCGTGTCAGAATTGGAGAACTTTATGGTTC 1160
AACTGAAGGAACTTCATCTCTCGTGAACATTGACGGACAT 1200 
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GTCGGAGCTTGCGGATTCTTGCCAATATCCCCATTAACAA 1240 
AGAAAATGCATCCGGTTCGATTAATTAAGGTTGATGATGT 1280 
CACTGGAGAAGCAATCCGAACTTCCGATGGACTTTGCATT 1320 
GCATGTAATCCAGGAGAGTCTGGAGCAATGGTGTCGACGA 1360 
TCAGAAAAAATAATCCATTATTGCAATTCGAGGGATATCT 1400

1410 1420 1430 1440

GAATAAGAAGGAAACGAATAAAAAGATTATCAGAGATGTC 1440
TTCGCAAAGGGAGATAGTTGCTTTTTGACTGGAGATCTTC 1480
TTCATTGGGATCGTCTTGGTTATGTATATTTCAAGGATCG 1520
TACTGGAGATACTTTCCGTTGGAAGGGAGAGAATGTGTCG 1560 
ACTACTGAAGTCGAGGCAATTCTTCATCCAATTACTGGAT 1600 

1610 1620 1630 1640

TGTCTGATGCAACTGTTTATGGTGTAGAGGTTCCTCAAAG 1640
AGAGGGAAGAGTTGGAATGGCGTCAGTTGTTCGAGTTGTA 1680 
TCGCATGAGGAAGATGAAACTCAATTTGTTCATAGAGTTG 1720 
GAGCAAGACTTGCCTCTTCGCTTACCAGCTACGCGATTCC 1760 
TCAGTTTATGCGAATTTGTCAGGATGTTGAGAAAACAGGT 1800 

1810 1820 1830 1840 


ACATTCAAACTTGTGAAGACGAATCTACAACGATTAGGTA 1840 
TCATGGATGCTCCTTCAGATTCAATTTACATCTACAATTC 1880
TGAAAATCGCAATTTTGTGCCGTTCGACAATGATTTGAGG 1920 
TGCAAGGTCTCACTGGGAAGTTATCCATTTTAA 1953 
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MKLEELVTVMLLTVAVIAQNLPIGVILAGVLILYITVVHG 40
DFIYRSYLTLNRDLTGLALIIEVKIDLWWRLHQNKGIHEL 80
FLDIVKKNPNKPAMIDIETNTTETYAEFNAHCNRYANYFQ 120
GLGYRSGDVVALYMENSVEFVAAWMGLAKIGVVTAWINSN 160
LKREQLVHCITASKTKAIITSVTLQNIMLDAIDQKLFDVE 200

210 220 230 240 


GIEVYSVGEPKKNSGFKNLKKKLDAQITTEPKTLDIVDFK 240
SILCFIYTSGTTGMPKAAVMKHFRYYSIAVGAAKSFGIRP 280
SDRMYVSMPIYHTAAGILGVGQALLGGSSCVIRKKFSASN 320
FWRDCVKYDCTVSQYIGEICRYLLAQPVVEEESRHRMRLL 360
VGNGLRAEIWQPFVDRFRVRIGELYGSTEGTSSLVNIDGH 400

410 420 430 440 

VGACGFLPISPLTKKMHPVRLIKVDDVTGEAIRTSDGLCI 440

ACNPGESGAMVSTIRKNNPLLQFEGYLNKKETNKKIIRDV 480

FAKGDSCFLTGDLLHWDRLGYVYFKDRTGDTFRWKGENVS 520

TTEVEAILHPITGLSDATVYGVEVPQREGRVGMASVVRVV 560

SHEEDETQFVHRVGARLASSLTSYAIPQFMRICQDVEKTG 600

610 620 630 640

TFKLVKTNLQRLGIMDAPSDSIYIYNSENRNFVPFDNDLR 640
CKVSLGSYPF 650 
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ATGAGGGAAATGCCGGACAGTCCCAAGTTTGCGTTAGTCA 40 
CGTTTGTTGTGTATGCAGTGGTTTTGTACAATGTCAACAG 80 
CGTTTTCTGGAAATTTGTATTCATCGGATATGTTGTATTT 120 
AGGCTGCTTCGCACTGATTTTGGAAGAAGAGCACTTGCCA 160 
CGTTACCTAGAGATTTTGCGGGACTGAAGCTCTTAATATC 200 

210 220 230 240

GGTTAAGTCGACAATTCGTGGCTTGTTCAAGAAAGATCGC 240 
CCAATTCATGAAATCTTTTTGAATCAGGTGAAACAGCATC 280 
CAAACAAAGTGGCGATTATTGAAATTGAAAGTGGTAGGCA 320 
GTTGACGTATCAAGAATTGAATGCGTTAGCTAATCAGTAT 360 
GCTAACCTTTACGTGAGTGAAGGTTACAAAATGGGCGACG 400 

410 420 430 440

TTGTCGCTTTGTTTATGGAAAATAGCATCGACTTCTTTGC 440 
AATTTGGCTGGGACTTTCCAAGATTGGAGTCGTGTCGGCG 480 
TTCATCAACTCAAACTTGAAGTTGGAGCCATTGGCACATT 520
CGATTAATGTTTCGAAGTGCAAATCATGCATTACCAATAT 560 
CAATCTGTTGCCGATGTTCAAAGCCGCTCGTGAAAAGAAT 600 

610 620 630 640 


CTGATCAGTGACGAGATCCACGTGTTTCTGGCTGGAACTC 640 
AGGTTGATGGACGTCATAGAAGTCTTCAGCAAGATCTCCA 680 
TCTTTTCTCTGAGGATGAACCTCCAGTTATAGACGGACTC 720 
AATTTTAGAAGCGTTCTGTGTTATATTTACACTTCCGGTA 760 
CTACCGGAAATCCAAAGCCAGCCGTCATTAAACACTTCCG 800

810 820 830 840 

TTACTTCTGGATTGCGATGGGAGCAGGAAAAGCATTTGGA 840
ATTAATAAGTCAGACGTTGTGTACATTACGATGCCAATGT 880 
ATCACTCTGCCGCCGGTATCATGGGTATTGGATCATTAAT 920 
TGCATTCGGGTCGACCGCTGTTATTAGGAAAAAGTTTTCG 960 
GCAAGCAACTTCTGGAAAGATTGCGTCAAGTACAACGTCA 1000

1010 1020 1030 1040 


CAGCGACACAGTACATTGGAGAAATCTGCAGGTATCTTCT 1040 
GGCAGCGAATCCATGTCCTGAAGAGAAACAACACAACGTG 1080
CGATTGATGTGGGGAAATGGTTTGAGAGGACAAATTTGGA 1120
AAGAGTTTGTAGGAAGATTTGGAATTAAGAAAATTGGAGA 1160
GTTGTACGGCTCAACAGAAGGAAACTCCAATATTGTTAAC 1200
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GTGGATAACCATGTTGGAGCTTGTGGATTCATGCCAATTT 1240 
ATCCCCATATTGGATCCCTCTACCCAGTTCGACTTATTAA 1280 
GGTTGATAGAGCCACTGGAGAGCTTGAACGTGATAAGAAC 1320 
GGACTCTGTGTGCCGTGTGTGCCTGGTGAAACTGGGGAAA 1360 
TGGTTGGCGTTATCAAGGAGAAAGATATTCTTCTAAAGTT 1400

1410 1420 1430 1440

CGAAGGATATGTCAGCGAAGGGGATACTGCAAAGAAAATC 1440 
TACAGAGATGTGTTCAAGCATGGAGATAAGGTGTTTGCAA 1480 
GTGGAGATATTCTTCATTGGGATGATCTTGGATACTTGTA 1520 
CTTTGTGGACCGTTGTGGAGACACTTTCCGTTGGAAAGGG 1560 
GAGAACGTGTCAACTACTGAAGTTGAGGGAATTCTTCAGC 1600 

1610 1620 1630 1640 


CTGTGATGGATGTGGAAGATGCAACTGTTTATGGAGTCAC 1640 
TGTCGGTAAAATGGAGGGGCGTGCCGGAATGGCTGGTATT 1680 
GTCGTCAAGGATGGAACGGATGTTGAGAAATTCATCGCCG 1720
ATATTACTTCTCGACTGACCGAAAATCTGGCGTCTTACGC 1760 
AATCCCTGTTTTCATTCGGCTGTGCAAGGAAGTTGATCGA 1800 

1810 1820 1830 1840 


ACCGGAACCTTCAAACTCAAGAAGACTGATCTTCAAAAAC 1840 

AAGGTTACGACCTGGTTGCTTGTAAAGGAGACCCAATTTA 1880 

CTACTGGTCAGCTGCAGAAAAATCCTACAAACCACTGACT 1920 

GACAAAATGCAACAGGATATTGACACTGGTGTTTATGATC 1960 
GCATTTAA 1968 
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MREMPDSPKFALVTFVVYAVVLYNVNSVFWKFVFIGYVVF 40
RLLRTDFGRRALATLPRDFAGLKLLISVKSTIRGLFKKDR 80
PIHEIFLNQVKQHPNKVAIIEIESGRQLTYQELNALANQY 120
ANLYVSEGYKMGDVVALFMENSIDFFAIWLGLSKIGVVSA 160
FINSNLKLEPLAHSINVSKCKSCITNINLLPMFKAAREKN 200

210 220 230 240 


LISDEIHVFLAGTQVDGRHRSLQQDLHLFSEDEPPVIDGL 240
NFRSVLCYIYTSGTTGNPKPAVIKHFRYFWIAMGAGKAFG 280
INKSDVVYITMPMYHSAAGIMGIGSLIAFGSTAVIRKKFS 320
ASNFWKDCVKYNVTATQYIGEICRYLLAANPCPEEKQHNV 360
RLMWGNGLRGQIWKEFVGRFGIKKIGELYGSTEGNSNIVN 400

410 420 430 440 


VDNHVGACGFMPIYPHIGSLYPVRLIKVDRATGELERDKN 440
GLCVPCVPGETGEMVGVIKEKDILLKFEGYVSEGDTAKKI 480
YRDVFKHGDKVFASGDILHWDDLGYLYFVDRCGDTFRWKG 520
ENVSTTEVEGILQPVMDVEDATVYGVTVGKMEGRAGMAGI 560
VVKDGTDVEKFIADITSRLTENLASYAIPVFIRLCKEVDR 600

610 620 630 640 


TGTFKLKKTDLQKQGYDLVACKGDPIYYWSAAEKSYKPLT 640
DKMQQDIDTGVYDRI 655
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ATGGCGTGTATGCATCAGGCTCAGCTATATAATGATCTAG 40
AGGAATTGCTAACTGGTCCATCAGTACCCATCGTTGCTGG 80 
AGCTGCTGGAGCTGCAGCTCTCACTGCCTACATTAACGCC 120 
AAATACCACATAGCCCATGATCTCAAGACCCTCGGTGGTG 160
GATTGACACAATCGTCCGAAGCGATTGATTTCATAAACCG 200 

210 220 230 240 


CCGCGTCGCACAAAAGCGCGTCCTCACGCACCACATCTTC 240 
CAGGAGCAGGTCCAAAAACAATCAAATCATCCCTTTCTTA 280 
TCTTTGAGGGCAAGACATGGTCTTACAAGGAGTTCTCTGA 320 
GGCATACACGAGGGTCGCGAACTGGCTGATTGATGAGCTG 360 
GACGTACAAGTAGGGGAGATGGTCGCAATTGATGGCGGAA 400 

410 420 430 440

ATAGTGCAGAGCACCTGATGCTTTGGCTTGCACTTGATGC 440 
AATCGGTGCGGCTACGAGTTTTTTGAACTGGAACCTGACA 480 
GGGGCAGGGTTAATTCATTGCATAAAGCTATGCGAATGTC 520 
GATTCGTTATCGCAGACATCGATATTAAAGCGAACATTGA 560 
ACCGTGCCGTGGCGAACTGGAGGAGACGGGCATCAACATT 600 

610 620 630 640 


CACTACTATGACCCATCCTTCATCTCATCGCTACCGAATA 640
ACACGCCAATTCCCGACAGCCGCACTGAGAACATTGAATT 680
AGATTCAGTACGAGGACTGATATACACATCTGGAACCACT 720
GGTCTACCTAAAGGCGTGTTTATAAGCACTGGCCGCGAGC 760
TTAGGACTGACTGGTCGATTTCAAAGTATCTAAATCTCAA 800 

810 820 830 840
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GCCCACGGATCGAATGTATACATGTATGCCGCTCTACCAT 840 
GCCGCTGCACACAGCCTCTGTACAGCATCAGTTATTCATG 880 
GTGGAGGTACCGTGGTATTGAGCAGGAAATTCTCACACAA 920 
GAAGTTCTGGCCTGAAGTTGTGGCTTCGGAAGCAAATATC 960 
ATTCAGTACGTTGGTGAATTAGGTCGATATCTCCTGAATG 1000 

1010 1020 1030 1040 
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GTCCAAAGAGTCCTTACGACAGGGCCCATAAAGTCCAGAT 1040
GGCGTGGGGCAATGGCATGCGTCCAGACGTGTGGGAAGCG 1080
TTTCGTGAACGCTTCAACATACCAATTATTCATGAGCTCT 1120
ATGCCGCAACCGATGGGCTCGGGTCAATGACCAATCGTAA 1160
CGCGGGCCCTTTTACAGCAAACTGTATTGCGCTGCGAGGG 1200
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CTGATCTGGCACTGGAAATTTCGAAATCAGGAAGTGCTGG 1240 
TCAAGATGGATCTCGATACTGATGAGATCATGAGAGATCG 1280 
CAATGGGTTTGCGATACGATGCGCTGTCAATGAACCTGGA 1320 
CAGATGCTTTTTCGGCTGACACCCGAAACTCTGGCTGGTG 1360 
CACCAAGCTACTACAACAACGAAACGGCCACACAGAGCAG 1400

1410 1420 1430 1440 


GCGGATTACAGATGTGTTTCAAAAGGGTGACCTGTGGTTC 1440 
AAGTCCGGTGACATGCTACGGCAAGACGCCGAAGGCCGCG 1480
TCTACTTTGTCGATCGACTAGGCGATACGTTCCGCTGGAA 1520
ATCCGAAAACGTTTCTACCAATGAAGTCGCGGACGTGATG 1560
GGCACATTTCCTCAGATTGCTGAAACGAATGTATACGGTG 1600 

1610 1620 1630 1640 


TCCTTGTGCCGGGTAACGATGGTCGAGTGCGCAGCCTCAA 1640

TTGTCATGGCAGACGGCGTGACAGAGTCGACATTCGCTTC 1680 

GCTGCCCTTGCAAAGCACGCCCGAGATCGGTTACCGGGTT 1720 

ATGCTGTACCACTGTTTCTGAGGGTAACTCCAGCACTTGA 1760 

ATATACGGGCACATTAAAGATTCAGAAAGGACGCCTCAAG 1800 

1810 1820 1830 1840 


CAGGAAGGTATAGACCCAGATAAGATTTCCGGCGAAGATA 1840 

AGTTATACTGGCTGCCGCCTGGTAGCGATATATATTTACC 1880 

ATTTGGAAAGATGGAGTGGCAGGGAATTGTAGATAAGCGT 1920
ATACGGCTGTGA 1932 
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

MACMHQAQLYNDLEELLTGPSVPIVAGAAGAAALTAYINA 40
KYHIAHDLKTLGGGLTQSSEAIDFINRRVAQKRVLTHHIF 80
QEQVQKQSNHPFLIFEGKTWSYKEFSEAYTRVANWLIDEL 120
DVQVGEMVAIDGGNSAEHLMLWLALDAIGAATSFLNWNLT 160
GAGLIHCIKLCECRFVIADIDIKANIEPCRGELEETGINI 200

210 220 230 240

HYYDPSFISSLPNNTPIPDSRTENIELDSVRGLIYTSGTT 240 
GLPKGVFISTGRELRTDWSISKYLNLKPTDRMYTCMPLYH 280
AAAHSLCTASVIHGGGTVVLSRKFSHKKFWPEVVASEANI 320
IQYVGELGRYLLNGPKSPYDRAHKVQMAWGNGMRPDVWEA 360
FRERFNIPIIHELYAATDGLGSMTNRNAGPFTANCIALRG 400

410 420 430 440 


LIWHWKFRNQEVLVKMDLDTDEIMRDRNGFAIRCAVNEPG 440
QMLFRLTPETLAGAPSYYNNETATQSRRITDVFQKGDLWF 480
KSGDMLRQDAEGRVYFVDRLGDTFRWKSENVSTNEVADVM 520 
GTFPQIAETNVYGVLVPGNDGRVRSLNCHGRRRDRVDIRF 560 
AALAKHARDRLPGYAVPLFLRVTPALEYTGTLKIQKGRLK 600

610 620 630 640 


QEGIDPDKISGEDKLYWLPPGSDIYLPFGKMEWQGIVDKR 640
IRL 643
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

CTTTACCATTCATCAGCTTCATTCTGCATTTTTAGCTTGA 40 
CGGCAGCCGGGTCTACGCTGATCATCGGCCGCAAGTTCTC 80 
CGCGAGAAACTTCATAAAGGAAGCGCGCGAGAACGACGCC 120 
ACGGTCATCCAGTACGTGGGTGAGACCTTGCGATATCTGC 160 
TCGCCACCCCCGGTGAAACCGATCCAGTTACTGGCGAAGA 200 

210 220 230 240 


CCTGGACAAAAAGCACAATATTCGAGCAGTATACGGCAAC 240 
GGGCTACGGCCGGATATCTGGAACCGCTTCAAiiGAGCGCT 280 
TCAACGTGCCGACGGTTGCCGAATTTTATGCTGCAACCGA 320 
GAGCCCAGGCGGAACATGGAACTATTCAACAAATGACTTC 360
ACTGCCGGAGCCATTGGGCACACTGGCGTGCTTAGTGGAT 400 

410 420 430 440 


GGCTTCTTGGACGCGGCCTTACTATTGTCGAGGTGGACCA 440 
GGAATCACAGGAACCATGGCGCGATCCCCAAACCGGGTTC 480
TGCAAGCCGGTCCCGCGAGGCGAAGCAGGCGAGCTCCTGT 520 
ATGCCATTGATCCGGCCGACCCGGGCGAGACCTTCCAGGG 560 
CTACTACCGCAACTCCTTTAGAGCACACTGGCGGCCG 597 



FIG. 82 



10 20 30 40 


LYHSSASFCIFSLTAAGSTLIIGRKFSARNFIKEARENDA 40
TVIQYVGETLRYLLATPGETDPVTGEDLDKKKNIRAVYGN 80
GLRPDIWNRFKERFNVPTVAEFYAATESPGGTWNYSTNDF 120 
TAGAIGHTGVLSGWLLGRGLTIVEVDQESQEPWRDPQTGF 160 
CKPVPRGEAGELLYAIDPADPGETFQGYYRNSFRAHWRP 199
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GCAAAGGCCGACGCGTGGCTGCGGACGGGTAACGTGATCA 40 
GGGCGGACAACGAAGGGCGACTCTTCTTCCACGACCGGAT 80 
CGGAGACACGTTCCGATGGAAGGGAGAGACNGTCAGCACA 120 
CAAGAGGTCAGTTTGGTGCTCGGACGACACGACTCAATCA 160 
AGGAGGCCAACGTGTACGGCGTGACGGTGCCGAACCACGA 200 



CGGGCGGGCCGGCTGCGCTGCGCTCACGCTATCAGACGCJ 240 
CTGGCGACTGAAAAGAAGCTGGGCGATGAGCTGCTAAAGG 280
GATTGGCTACTCACTCGTCGACTTCGCTTCCCAAGTTTGC 320 
GGTGCCGCAGTTCCTACGGGTGGTGCGCGGCGAGATGCAG 360 
TCAACGGGCACCAACAAGCAACAGAAGCACGACCTGAGGG 400 



TGCAGGGTGTAGAGCCGGGCAAGGTGGGCGTAGACGAGGT 440 
GTACTGGTTGCGGGGAGGGACATATGTACCATTCGGAACA 480 
GAGGATTGGGATGGGTTGAAGAAGGGTCTTGTGAAGTTGT 520 
GA 522 




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AKADAWLRTGNV I RADNEGRLFFHDR I GDTFRWKGE I VST 40 



10 20 30 





EDWDGLKKGLVKL 173 
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ATGTCTCCCATACAGGTTGTTGTCTTTGCCTTGTCAAGGA 40 
TTTTCCTGCTATTATTCAGACTTATCAAGCTAATTATAAC 80 
CCCTATCCAGAAATCACTGGGTTATCTATTTGGTAATTAT 120 
TTTGATGAATTAGACCGTAAATATAGATACAAGGAGGATT 160 
GGTATATTATTCCTTACTTTTTGAAAAGCGTGTTTTGTTA 200 

210 220 230 240 


TATCATTGATGTGAGAAGACATAGGTTTCAAAACTGGTAC 240 
TTATTTATTAAACAGGTCCAACAAAATGGTGACCATTTAG 280 
CGATTAGTTACACCCGTCCCATGGCCGAAAAGGGAGAATT 320 
TCAACTCGAAACCTTTACGTATATTGAAACTTATAACATA 360 
GTGTTGAGATTGTCTCATATTTTGCATTTTGATTATAACG 400

410 420 430 440 


TTCAGGCCGGTGACTACGTGGCAATCGATTGTACTAATAA 440 
ACCTCTTTTCGTATTTTTATGGCTTTCTTTGTGGAACATT 480
GGGGCTATTCCAGCTTTTTTAAACTATAATACTAAAGGCA 520 
CTCCGCTGGTTCACTCCCTAAAGATTTCCAATATTACGCA 560 
GGTATTTATTGACCCTGATGCCAGTAATCCGATCAGAGAA 600 

610 620 630 640


TCGGAAGAAGAAATCAAAAACGCACTTCCTGATGTTAAAT 640 
TAAACTATCTTGAAGAACAAGACTTAATGCATGAACTTTT 680 
AAATTCGCAATCACCGGAATTCTTACAACAAGACAACGTT 720 
AGGACACCACTAGGCTTGACCGATTTTAAACCCTCTATGT 760
TAATTTATACATCTGGAACCACTGGTTTGCCTAAATCCGC 800 

810 820 830 840 


TATTATGTCTTGGAGAAAATCCTCCGTAGGTTGTCAAGTT 840
TTTGGTCATGTTTTACATATGACTAATGAAAGCACTGTGT 880
TCACAGCCATGCCATTGTTCCATTCAACTGCTGCCTTATT 920 
AGGTGCGTGCGCCATTCTATCTCACGGTGGTTGCCTTGCG 960 
TTATCGCATAAATTTTCTGCCAGTACATTTTGGAAGCAAG 1000

1010 1020 1030 1040 


TTTATTTAACAGGAGCCACGCACATCCAATATGTCGGAGA 1040 

AGTCTGTAGATACCTGTTACATACGCCAATTTCTAAGTAT 1080 

GAAAAGATGCATAAGGTGAAGGTTGCTTATGGTAACGGGC 1120

TGAGACCTGACATCTGGCAGGACTTCAGGAAGAGGTTCAA 1160

CATAGAAGTTATTGGTGAATTCTATGCCGCAACTGAAGCT 1200 

FIG. 86A 
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1210 1220 1230 1240 


CCTTTTGCTACAACTACCTTCCAGAAAGGTGACTTTGGAA 1240 
TTGGCGCATGTAGGAACTATGGTACTATAATTCAATGGTT 1280
TTTGTCATTCCAACAAACATTGGTAAGGATGGACCCAAAT 1320
GACGATTCCGTTATATATAGAAATTCCAAGGGTTTCTGCG 1360 
AAGTGGCCCCTGTTGGCGAACCAGGAGAAATGTTAATGAG 1400 

1410 1420 1430 1440 


AATCTTTTTCCCTAAAAAACCAGAAACATCTTTTCAAGGT 1440

TATCTTGGTAATGCCAAGGAAACAAAGTCCAAAGTTGTGA 1480

GGGATGTCTTCAGACGTGGCGATGCTTGGTATAGATGTGG 1520 

AGATTTATTAAAAGCGGACGAATATGGATTATGGTATTTC 1560

CTTGATAGAATGGGTGATACTTTCAGATGGAAATCTGAAA 1600 

1610 1620 1630 1640 


ATGTTTCCACTACTGAAGTAGAAGATCAGTTGACGGCCAG 1640
TAACAAAGAACAATATGCACAAGTTCTAGTTGTTGGTATT 1680
AAAGTACCTAAATATGAAGGTAGAGCTGGTTTTGCAGTTA 1720
TTAAACTAACTGACAACTCTCTTGACATCACTGCAAAGAC 1760
CAAATTATTAAATGATTCCTTGAGCCGGTTAAATCTACCG 1800

1810 1820 1830 1840 


TCTTATGCTATGCCCCTATTTGTTAAATTTGTTGATGAAA 1840 
TTAAAATGACAGATAACCTCATAAAATTTTGA 1872

MSPIQVVVFALSRIFLLLFRLIKLIITPIQKSLGYLFGNY 40
FDELDRKYRYKEDWYIIPYFLKSVFCYIIDVRRHRFQNWY 80
LFIKQVQQNGDHLAISYTRPMAEKGEFQLETFTYIETYNI 120
VLRLSHILHFDYNVQAGDYVAIDCTNKPLFVFLWLSLWNI 160
GAIPAFLNYNTKGTPLVHSLKISNITQVFIDPDASNPIRE 200

210 220 230 240


SEEEIKNALPDVKLNYLEEQDLMHELLNSQSPEFLQQDNV 240
RTPLGLTDFKPSMLIYTSGTTGLPKSAIMSWRKSSVGCQV 280
FGHVLHMTNESTVFTAMPLFHSTAALLGACAILSHGGCLA 320
LSHKFSASTFWKQVYLTGATHIQYVGEVCRYLLHTPISKY 360
EKMKKVKVAYGNGLRPDIWQDFRKRFNIEVIGEFYAATEA 400

410 420 430 440


PFATTTFQKGDFGIGACRNYGTIIQWFLSFQQTLVRMDPN 440
DDSVIYRNSKGFCEVAPVGEPGEMLMRIFFPKKPETSFQG 480
YLGNAKETKSKVVRDVFRRGDAWYRCGDLLKADEYGLWYF 520
LDRMGDTFRWKSENVSTTEVEDQLTASNKEQYAQVLVVGI 560
KVPKYEGRAGFAVIKLTDNSLDITAKTKLLNDSLSRLNLP 600

610 620 630 640 


SYAMPLFVKFVDEIKMTDNLIKF 623

GTGTCCGATTACTACGGCGGCGCACACACAACGGTCAGGC 40
TGATCGACCTGGCAACTCGGATGCCGCGAGTGTTGGCGGA 80 
CACGCCGGTGATTGTGCGTGGGGCAATGACCGGGCTGCTG 120 
GCCCGGCCGAATTCCAAGGCGTCGATCGGCACGGTGTTCC 160
AGGACCGGGCCGCTCGCTACGGTGACCGAGTCTTCCTGAA 200 

210 220 230 240


ATTCGGCGATCAGCAGCTGACCTACCGCGACGCTAACGCC 240 
ACCGCCAACCGGTACGCCGCGGTGTTGGCCGCCCGCGGCG 280 
TCGGCCCCGGCGACGTCGTTGGCATCATGTTGCGTAACTC 320 
ACCCAGCACAGTCTTGGCGATGCTGGCCACGGTCAAGTGC 360 
GGCGCTATCGCCGGCATGCTCAACTACCACCAGCGCGGCG 400 

410 420 430 440 


AGGTGTTGGCGCACAGCCTGGGTCTGCTGGACGCGAAGGT 440 
ACTGATCGCAGAGTCCGACTTGGTCAGCGCCGTCGCCGAA 480
TGCGGCGCCTCGCGCGGCCGGGTAGCGGGCGACGTGCTGA 520 
CCGTCGAGGACGTGGAGCGATTCGCCACAACGGCGCCCGC 560 
CACCAACCCGGCGTCGGCGTCGGCGGTGCAAGCCAAAGAC 600 

610 620 630 640 


ACCGCGTTCTACATCTTCACCTCGGGCACCACCGGATTTC 640 
CCAAGGCCAGTGTCATGACGCATCATCGGTGGCTGCGGGC 680 
GCTGGCCGTCTTCGGAGGGATGGGGCTGCGGCTGAAGGGT 720
TCCGACACGCTCTACAGCTGCCTGCCGCTGTACCACAACA 760
ACGCGTTAACGGTCGCGGTGTCGTCGGTGATCAATTCTGG 800 

810 820 830 840


GGCGACCCTGGCGCTGGGTAAGTCGTTTTCGGCGTCGCGG 840 
TTCTGGGATGAGGTGATTGCCAACCGGGCGACGGCGTTCG 880 
TCTACATCGGCGAAATCTGCCGTTATCTGCTCAACCAGCC 920 
GGCCAAGCCGACCGACCGTGCCCACCAGGTGCGGGTGATC 960
TGCGGTAACGGGCTGCGGCCGGAGATCTGGGATGAGTTCA 1000

1010 1020 1030 1040 


CCACCCGCTTCGGGGTCGCGCGGGTGTGCGAGTTCTACGC 1040 
CGCCAGCGAAGGCAACTCGGCCTTTATCAACATCTTCAAC 1080
GTGCCCAGGACCGCCGGGGTATCGCCGATGCCGCTTGCCT 1120
TTGTGGAATACGACCTGGACACCGGCGATCCGCTGCGGGA 1160
TGCGAGCGGGCGAGTGCGTCGGGTACCCGACGGTGAACCC 1200

FIG. 88 A 
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1210 1220 123 

GGCCTGTTGCTTAGCCGGGTCAACCGGCTGCAGCCGTTCG 1240 

ACGGCTACACCGACCCGGTTGCCAGCGAAAAGAAGTTGGT 1280 

GCGCAACGCTTTTCGAGATGGCGACTGTTGGTTCAACACC 1320 

GGTGACGTGATGAGCCCGCAGGGCATGGGCCATGCCGCCT 1360 

TCGTCGATCGGCTGGGCGACACCTTCCGCTGGAAGGGCGA 1400

1410 1420 1430 1440 


GAATGTCGCCACCACTCAGGTCGAAGCGGCACTGGCCTCC 1440 
GACCAGACCGTCGAGGAGTGCACGGTCTACGGCGTCCAGA 1480 
TTCCGCGCACCGGCGGGCGCGCCGGAATGGCCGCGATCAC 1520 
ACTGCGCGCTGGCGCCGAATTCGACGGCCAGGCGCTGGCC 1560 
CGAACGGTTTACGGTCACTTGCCCGGCTATGCACTTCCGC 1600 

1610 1620 1630 1640 


TCTTTGTTCGGGTAGTGGGGTCGCTGGCGCACACCACGAC 1640 
GTTCAAGAGTCGCAAGGTGGAGTTGCGCAACCAGGCCTAT 1680
GGCGCCGACATCGAGGATCCGCTGTACGTACTGGCCGGCC 1720 
CGGACGAAGGATATGTGCCGTACTACGCCGAATACCCTGA 1760 
GGAGGTTTCGCTCGGAAGGCGACCGCAGGGCTAG 1794 

FIG. 88B 

MSDYYGGAHTTVRLIDLATRMPRVLADTPVIVRGAMTGLL 40
ARPNSKASIGTVFQDRAARYGDRVFLKFGDGQLTYRDANA 80
TANRYAAVLAARGVGPGDVVGIMLRNSPSTVLAMLATVKC 120
GAIAGMLNYHQRGEVLAHSLGLLDAKVLIAESDLVSAVAE 160
CGASRGRVAGDVLTVEDVERFATTAPATNPASASAVQAKD 200 

210 220 230 240 

TAFYIFTSGTTGFPKASVMTHHRWLRALAVFGGMGLRLKG 240
SDTLYSCLPLYHNNALTVAVSSVINSGATLALGKSFSASR 280
FWDEVIANRATAFVYIGEICRYLLNQPAKPTDRAHQVRVI 320
CGNGLRPEIWDEFTTRFGVARVCEFYAASEGNSAFINIFN 360
VPRTAGVSPMPLAFVEYDLDTGDPLRDASGRVRRVPDGEP 400 

410 420 430 440

GLLLSRVNRLQPFDGYTDPVASEKKLVRNAFRDGDCWFNT 440
GDVMSPQGMGHAAFVDRLGDTFRWKGENVATTQVEAALAS 480 
DQTVEECTVYGVQIPRTGGRAGMAAITLRAGAEFDGQALA 520
RTVYGHLPGYALPLFVRVVGSLAHTTTFKSRKVELRNQAY 560
GADIEDPLYVLAGPDEGYVPYYAEYPEEVSLGRRPQG 597
   1 cga ccc acg cgt ccg ggg atg ttt gcg agc
   1                         M   F   A   S

  31 ggc tgg aac cag acg gtg ccg ata gag gaa
   5 G   W   N   Q   T   V   P   I   E   E

  61 gcg ggc tcc atg gct gcc ctc ctg ctg ctg
  15 A   G   S   M   A   A   L   L   L   L

  91 ccc ctg ctg ctg ttg cta ccg ctg ctg ctg
  25 P   L   L   L   L   L   P   L   L   L

 121 ctg ctg aag cta cac ctc tgg ccg cag ttg
  35 L   L   K   L   H   L   W   P   Q   L

 151 cgc tgg ctt ccg gcg gac ttg gcc ttt gcg
  45 R   W   L   P   A   D   L   A   F   A

 181 gtg cga gct ctg tgc tgc aaa agg gct ctt
  55 V   R   A   L   C   C   K   R   A   L

 211 cga gct cgc gcc ctg gcc gcg gct gcc gcc
  65 R   A   R   A   L   A   A   A   A   A

 241 gac ccg gaa ggt ccc gag ggg ggc tgc agc
  75 D   P   E   G   P   E   G   G   C   S
 271 ctg gcc tgg cgc ctc gcg gaa ctg gcc cag
  85 L   A   W   R   L   A   E   L   A   Q

 301 cag cgc gcc gcg cac acc ttt ctc att cac
  95 Q   R   A   A   H   T   F   L   I   H

 331 ggc tcg cgg cgc ttt agc tac tca gag gcg
 105 G   S   R   R   F   S   Y   S   E   A

 361 gag cgc gag agt aac agg gct gca cgc gcc
 115 E   R   E   S   N   R   A   A   R   A

 391 ttc cta cgt gcg cta ggc tgg gac tgg gga
 125 F   L   R   A   L   G   W   D   W   G

 421 ccc gac ggc ggc gac agc ggc gag ggg agc
 135 P   D   G   G   D   S   G   E   G   S

 451 gct gga gaa ggc gag cgg gca gcg ccg gga
 145 A   G   E   G   E   R   A   A   P   G

 481 gcc gga gat gca gcg gcc gga agc ggc gcg
 155 A   G   D   A   A   A   G   S   G   A

 511 gag ttt gcc gga ggg gac ggt gcc gcc aga
 165 E   F   A   G   G   D   G   A   A   R

 541 ggt gga gga gag ccc gcc gcc cct ctg tca
 175 G   G   G   E   P   A   A   P   L   S

 571 cct gga gca act gtg gcg ctg ctc ctc ccc
 185 P   G   A   T   V   A   L   L   L   P

 601 gct ggc cca gag ttt ctg tgg ctc tgg ttc
 195 A   G   P   E   F   L   W   L   W   F

FIG. 94C



 631 ggg ctg gcc aag gcc ggc ctg cgc act gcc
 205 G   L   A   K   A   G   L   R   T   A

 661 ttt gtg ccc acc gcc ctg cgc cgg ggc ccc
 215 F   V   P   T   A   L   R   R   G   P

 691 ctg ctg cac tgc ctc cgc agc tgc ggc gcg
 225 L   L   H   C   L   R   S   C   G   A

 721 cgc gcg ctg gtg ctg gcg cca gag ttt ctg
 235 R   A   L   V   L   A   P   E   F   L

 751 gag tcc ctg gag ccg gac ctg ccc gcc ctg
 245 E   S   L   E   P   D   L   P   A   L

 781 aga gcc atg ggg ctc cac ctg tgg gct gca
 255 R   A   M   G   L   H   L   W   A   A

 811 ggc cca gga acc cac cct gct gga att agc
 265 G   P   G   T   H   P   A   G   I   S

 841 gat ttg ctg gct gaa gtg tcc gct gaa gtg
 275 D   L   L   A   E   V   S   A   E   V

FIG. 94D
 871 gat ggg cca gtg cca gga tac ctc tct tcc
 285 D   G   P   V   P   G   Y   L   S   S

 901 ccc cag agc ata aca gac acg tgc ctg tac
 295 P   Q   S   I   T   D   T   C   L   Y

 931 atc ttc acc tct ggc acc acg ggc ctc ccc
 305 I   F   T   S   G   T   T   G   L   P

 961 aag gct gct cgg atc agt cat ctg aag atc
 315 K   A   A   R   I   S   H   L   K   I

 991 ctg caa tgc cag ggc ttc tat cag ctg tgt
 325 L   Q   C   Q   G   F   Y   Q   L   C

1021 ggt gtc cac cag gaa gat gtg atc tac ctc
 335 G   V   H   Q   E   D   V   I   Y   L

FIG. 94E
1051 gcc ctc
 345 A   L

1081 ctg ctg
 355 L   L

1111 ggg gcc
 365 G   A

1141 tcg gct
 375 S   A

1171 cag cac
 385 Q   H

1201 ggg gag
 395 G   E

1231 ccc ccg
 405 P   P

1261 gtc cgg
 415 V   R
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allowing a meaningful search, these claims have not been searched* 
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